Preface


The field of Computer Graphics has evolved rapidly over the past decade following
the development of a large collection of algorithms and techniques for various
applications in modelling, animation, visualisation, real-time rendering and game engine
design. Advances in graphics hardware capabilities and processor technology have 
continuously fuelled this growth. As a result, this field continues to have enormous 
potential for further research and development. Computer graphics has also been 
one of the popular subjects in the computer science and computer engineering 
disciplines for several years. It is a field where one could always find new and 
interesting ideas, elegant algorithms and robust implementations. 

I have been teaching both introductory and advanced courses on computer 
graphics for the past 12 years, and have constantly observed the enthusiasm 
of students in learning as well as mastering various techniques used for
three-dimensional modelling, rendering and animation. The visual effects some of these
methods produce captivate their interest, and motivate them to further study and 
research more advanced techniques. This book evolved from a compilation of my 
lecture notes and reference material for a graduate course in advanced computer 
graphics taught in the Department of Computer Science and Software Engineering 
at the University of Canterbury. The primary aim of this book project has been 
to develop a reference text suitable for both students and researchers, providing 
an in-depth and comprehensive coverage of important methods that are useful 
in the field of character animation. Working towards this goal, I soon realised 
that a book covering a large number of subfields ranging from physically based 
simulation to non-photorealistic rendering would be a highly ambitious project. This 
book includes a selection of topics which I consider as fundamental to the area of 
animation and rendering, and I hope that it will contribute to a deeper and broader 
understanding of key algorithms used in advanced computer graphics. 

I am very much indebted to the graduate students and staff in the Department 
of Computer Science and Software Engineering, University of Canterbury, for 
their support, valuable feedback, and encouragement. My sincere thanks go to 
Dr. Richard Lobb (Adjunct Senior Fellow, Department of Computer Science and 
Software Engineering, University of Canterbury) for devoting so much of his 
valuable time and expertise for reviewing the manuscript. I am thankful to Dr.
Christian Long (Department of English, University of Canterbury), for copy-editing 
the manuscript. His thorough and meticulous checking of spelling, punctuation and 
grammar has helped improve the clarity of the material presented. 

I would like to thank the editorial team members for their help throughout this 
book project. While the manuscript was being prepared, a series of unfortunate 
events, including the passing away of my mother, and two major earthquakes in
Christchurch, brought the progress to a standstill for several months. Special thanks 
to Helen Desmond and Beverley Ford for their continuous encouragement. They 
showed a tremendous amount of patience, and always so kindly agreed to extend 
the manuscript submission deadline a number of times. 

I am very grateful to my family for their endless support. I greatly appreciate their 
patience and understanding throughout the time when I was obsessed with writing 
this book. 


R. Mukundan
Department of Computer Science and Software Engineering
University of Canterbury
Christchurch, New Zealand
Chapter 1
Introduction 


1.1 Advanced Computer Graphics 


Computer graphics algorithms are being increasingly used in many scientific and
technological areas, with an explosive growth in applications requiring
three-dimensional rendering and animation. The expansion of computer graphics into
diverse and interdisciplinary areas is the result of many factors such as the
ever-increasing power and capability of graphics hardware, decreasing hardware
costs, availability of a wide range of software tools, research advancements in the
field, and significant improvements in graphics application programming interfaces
(APIs). Additionally, vast amounts of resources including images, 3D models, and
libraries are now easily available to developers and researchers for their work. With
the emergence of programmable graphics hardware, the power of graphics APIs to
render complex models and scenes has greatly increased, and it has become easier to
create fast and robust implementations of several advanced algorithms. Following
these developments, there is also an increasing need for reference books that give 
an in-depth coverage of advanced methods that are fundamental to many application 
domains. 

Advanced computer graphics is a field that encompasses a vast range of topics 
and a large number of subfields such as game engine development, real-time 
rendering, global illumination methods and non-photorealistic rendering. Indeed, 
this field includes a large body of concepts and algorithms not generally covered in
introductory graphics texts that deal primarily with basic transformations,
projections, lighting, three-dimensional modelling techniques, texturing and rasterization
algorithms.

This book aims to provide a comprehensive treatment of the theoretical concepts 
and associated methods related to four core areas: articulated character animation, 
curve and surface design, mesh processing, and collision detection. The area of 
character animation is further subdivided into scene graphs, skeletal animation, 
quaternion rotations and kinematics. A principal objective of this book is to serve as
a reference text for both students and researchers. It is designed for courses that build
upon introductory computer graphics concepts. The topics discussed in the book are
commonly covered in graduate or advanced undergraduate graphics courses. These 
include the theoretical as well as the implementation aspects of several algorithms. 
To help students understand the concepts clearly, a set of demonstration programs 
is included with each chapter. Necessary class libraries giving the implementations 
of important methods of each class are also provided. Some of the concepts that 
have recently found a great deal of importance in research such as dual quaternion 
transformations, and bounding interval hierarchies are also presented. 


1.2 Supplementary Material 


Each chapter is accompanied by a collection of software modules and demonstration 
programs that show the details and working of key algorithms. All programs are 
written in C++. The reader is assumed to be familiar with the basic OpenGL
library, which is an easy-to-program, widely accepted cross-platform API for
developing graphics applications. To keep the implementations simple, shader language
functions or any other OpenGL extensions are not used. The source code, including
relevant class definitions and input files, can be downloaded from Springer's website,
http://extras.springer.com/978-1-4471-2339-2. 

The programs are written entirely by the author, with the primary aim of 
motivating students to explore further each technique, and to implement their own 
creative ideas. They are just tools which developers and researchers could use to 
build larger frameworks or to try better solutions. A simple programming approach 
is used so that students with minimal knowledge of C/C++ language and OpenGL 
will be able to start using the code and work towards more complex or useful 
applications. None of the software is optimized in terms of algorithm performance 
or speed. Similarly, object oriented programming concepts are not heavily used, 
leaving room for a lot of further development. 


1.3 Notations 


In order to have a clear distinction between points, vectors and other mathematical 
entities, the following notation is normally used in this book. Note that in
exceptional cases, a different notation may be used in each of the following categories to
avoid ambiguity. For example, a tangent vector to a curve may be denoted by T(t) 
instead of f(t). 


Point: A point is generally denoted by an uppercase letter in italics as P. The
three-dimensional coordinates of P will be written as $(x_p, y_p, z_p)$. The vector
representation of P having the same components as above will be denoted as $\mathbf{p}$.
The coordinates of the point $P_i$ will be written as either $(x_{pi}, y_{pi}, z_{pi})$ or,
if there is no ambiguity, as simply $(x_i, y_i, z_i)$.


Vector: A vector will be denoted by a lowercase letter in italics and bold font as $\mathbf{v}$.
Its vector components will be denoted as $(x_v, y_v, z_v)$.


Complex number: Complex numbers are treated as two-dimensional vectors and 
denoted using a lowercase letter in italics and bold font as z. 


Quaternions: Uppercase letters in italic font (such as Q) will be used to denote 
quaternions. Dual-quaternions will be denoted using uppercase letters in bold and 
italic font as Q. 


Line segment: A line segment will be noted using its end points as AB. 


Triangle: A triangle will be denoted using its vertices as ABC and its area as
$\Delta ABC$. A triangle may also be named using an uppercase letter in italics as T.


Plane: Uppercase Greek symbols such as $\Gamma$, $\Pi$, will be used for denoting planes
and general polygonal surface elements. 


Matrices: Matrices will be denoted using uppercase letters in bold font as M. 


1.4 Contents Overview 


This section gives an outline of subsequent chapters of the book. Chapter 2 should 
be treated as revision material on analytical properties of geometrical primitives and 
may be skipped if you have a good mathematical background. Chapters 3, 4, 5, 6 
are closely related to the area of character animation. Chapters 7, 8, 9 deal with 
mutually independent topics, and can be read separately in any order. 


Chapter 2 — Mathematical Preliminaries: This chapter outlines important
mathematical concepts related to points, vectors, transformations, lines and planes that
are fundamental to several methods in computer graphics. Subsequent chapters in 
the book make use of the results presented here. 


Chapter 3 — Scene Graphs: This chapter introduces scene graphs and gives 
examples to show their importance in representing transformation hierarchies in 
articulated models. A sample implementation of the basic scene graph structure is 
provided. 


Chapter 4 — Skeletal Animation: This chapter discusses the animation of two 
different types of articulated character models. The processes of vertex blending, 
vertex skinning and keyframing are introduced. The chapter also gives a sample 
implementation of a skeleton animation module. 


Chapter 5 — Quaternions: Quaternions are extensively used in animations to 
represent three-dimensional rotations. This chapter gives a comprehensive coverage 
of quaternion algebra, transformations and quaternion based methods for rotation 
interpolation. A recently introduced concept of dual quaternions is also presented.


Chapter 6 — Kinematics: This chapter presents forward and inverse kinematics
solutions for animating a joint chain. Iterative algorithms suitable for graphics 
applications are also presented. 


Chapter 7 — Curves and Surfaces: This chapter gives an in-depth treatment of 
parametric curves, splines and polynomial interpolants. Fundamental techniques in 
curve and surface design using Hermite splines, cardinal splines and B-splines are 
presented in detail. 


Chapter 8 — Mesh Processing: This chapter discusses mesh data structures 
and algorithms. Important edge-based data structures useful for processing
adjacency queries are introduced. Algorithms for mesh simplification, subdivision and
parameterization are presented. The chapter also outlines methods for polygon 
triangulation, which is generally a key component of mesh processing algorithms. 


Chapter 9 — Collision Detection: This chapter details commonly used bounding 
volume representations of objects in collision detection algorithms, and presents the 
computation of bounding volume overlap tests. Bounding volume hierarchies and 
spatial partitioning trees are also discussed in detail. 


Chapter 2 
Mathematical Preliminaries 


Overview 


Mathematical operations on points, vectors and matrices are needed for processing 
information related to geometrical objects. Even in the modelling of a simple
three-dimensional scene, vectors and matrices play an important role in specifying an
object's position, orientation and transformations. Methods for lighting, intersection 
testing, projections, etc., use a series of vector operations. This chapter gives an 
overview of computations using geometrical primitives and shapes that form the 
basis for several algorithms presented in subsequent chapters of the book. 

Parametric representations are often used in methods involving geometrical 
primitives. This chapter deals with analytical equations of lines, planes and curves, 
and their applications in geometrical computations. Properties of three-dimensional 
transformations are discussed using their matrix representations. The chapter also 
introduces concepts such as signed area and distance, affine combinations of points 
and barycentric coordinates. 


2.1 Points and Vectors 


A point is the most fundamental graphics primitive, and is represented in a
three-dimensional Cartesian coordinate system by the 3-tuple (x, y, z), where x, y, z
denote the distances of the point from the origin of the system along the respective 
axes directions. In graphics, we commonly use an extended coordinate system, 
where the same point is denoted by the 4-tuple (x, y, z, 1). This representation is 
called the homogeneous coordinate system. Homogeneous coordinates provide a 
unified and elegant framework for representing different types of transformations 
and projections that are commonly applied to both points and vectors (Box 2.1).


Box 2.1 Homogeneous Coordinate System


A 3D point given by homogeneous coordinates (a, b, c, d) where d is non-zero,
has an equivalent representation in Cartesian coordinates given by (a/d, b/d, c/d).

The 4-tuple (a, b, c, 0) denotes a point at infinity that has associated with it a 
directional vector (a, b, c). 

The many-one mapping from homogeneous to Cartesian space is shown 
below: 


(hx, hy, hz, h) → 3D point (x, y, z) for all non-zero values of h.
(x, y, z, w) → 3D point (x/w, y/w, z/w) if w ≠ 0.
(x, y, z, 0) → 3D vector (x, y, z).




Fig. 2.1 Geometric interpretation of (a) subtraction of a point from another, (b) addition of two 
points given in homogeneous coordinates, and (c) addition of two vectors 


We will now look at the geometrical interpretations of operations of addition and
subtraction on homogeneous coordinates. When we subtract a point $Q = (x_q, y_q, z_q, 1)$
from the point $P = (x_p, y_p, z_p, 1)$, we get a vector $P - Q$ which has components
$(x_p - x_q, y_p - y_q, z_p - z_q, 0)$. This vector originates from the point Q and is
directed towards the point P, and is denoted as $\vec{QP}$. The direct addition of two
points P and Q is not a geometrically valid operation, as it can produce different results
depending on the coordinate reference frame used. If we use the homogeneous
coordinate representation of P and Q as given above, the operation of addition yields
$(x_p + x_q, y_p + y_q, z_p + z_q, 2)$, which is actually the midpoint of the line segment
PQ (Fig. 2.1b). Points can, however, be added in a special way called the affine
combination (see Sect. 2.7) that gives a well-defined point. The addition of two
vectors $\mathbf{p} = (x_p, y_p, z_p, 0)$ and $\mathbf{q} = (x_q, y_q, z_q, 0)$ is always a
valid operation that produces another vector
$\mathbf{p} + \mathbf{q} = (x_p + x_q, y_p + y_q, z_p + z_q, 0)$. This vector is along
the diagonal of the parallelogram formed by $\mathbf{p}$ and $\mathbf{q}$.
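The behaviour of addition and subtraction on homogeneous 4-tuples described above can be checked with a few lines of C++. This is only a minimal sketch: the type `Hom4` and the helper names `add`, `sub` and `toCartesian` are hypothetical and are not taken from the book's class libraries.

```cpp
#include <cassert>

// Hypothetical 4-tuple type for homogeneous coordinates (x, y, z, w).
struct Hom4 {
    double x, y, z, w;
};

// Component-wise sum of two homogeneous 4-tuples.
Hom4 add(const Hom4& a, const Hom4& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

// Component-wise difference; point - point gives w = 0, i.e. a vector.
Hom4 sub(const Hom4& a, const Hom4& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w};
}

// Maps a homogeneous point with w != 0 to the equivalent tuple with w = 1.
Hom4 toCartesian(const Hom4& h) {
    assert(h.w != 0.0);  // w == 0 denotes a vector, not a finite point
    return {h.x / h.w, h.y / h.w, h.z / h.w, 1.0};
}
```

For P = (1, 2, 3, 1) and Q = (3, 6, 5, 1), `sub(P, Q)` has a fourth component of 0 (a vector), while `toCartesian(add(P, Q))` gives (2, 4, 4, 1), the midpoint of PQ.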
[Fig. 2.2 labels: $\mathbf{u} \cdot \mathbf{v} = \cos\theta$; $|\mathbf{u} \times \mathbf{v}| = 2(\Delta ABC)$; $\mathbf{u} \times \mathbf{v}$]


Fig. 2.2 (a) Dot-product and cross-product of two vectors u,v. (b) Projection of a vector s on a 
unit vector u. (c) Reflection of a vector s with respect to a unit vector n 


Fig. 2.3 The normal vector and area of a triangle specified using vertex
coordinates can be computed with the help of two vectors defined along the edges

Like addition, the operations of negation and scalar multiplication should also
be carefully performed on points represented in homogeneous coordinates. It can
be seen that the operation of negation given by $-P = (-x_p, -y_p, -z_p, -1)$ in effect
yields the same point P. In general, the operation of scalar multiplication defined as
$sP = (s x_p, s y_p, s z_p, s)$ for any non-zero value of s, gives the same point P.

We will often require the computation of angles between two vectors. This and 
other operations, such as projection, require vectors to be normalized first. The 
normalization of a vector is the process of converting it to a unit vector that has 
a magnitude 1. In order to normalize a vector $\mathbf{p} = (x_p, y_p, z_p, 0)$, we simply
divide each element by the vector magnitude d given by

$$ d = |\mathbf{p}| = \sqrt{x_p^2 + y_p^2 + z_p^2} \tag{2.1} $$


If $\mathbf{v}$ is a two-dimensional vector $(x_v, y_v)$, then the vector
$\mathbf{v}^{\perp} = (-y_v, x_v)$ is perpendicular to and on the left side of $\mathbf{v}$.
The vector $\mathbf{v}^{\perp}$ is sometimes called the perp-vector. It may be noted that
$\mathbf{v}^{\perp\perp} = (-x_v, -y_v) = -\mathbf{v}$.

Two important vector operations used in graphics are the dot-product and the
cross-product. Given two unit vectors $\mathbf{u} = (x_u, y_u, z_u, 0)$ and
$\mathbf{v} = (x_v, y_v, z_v, 0)$, their dot-product
$\mathbf{u} \cdot \mathbf{v} = x_u x_v + y_u y_v + z_u z_v$ is equal to the cosine of the
angle between the vectors. The cross-product
$\mathbf{u} \times \mathbf{v} = (y_u z_v - y_v z_u,\; z_u x_v - z_v x_u,\; x_u y_v - x_v y_u,\; 0)$
is a vector perpendicular to both $\mathbf{u}$ and $\mathbf{v}$, so that $\mathbf{u}$,
$\mathbf{v}$, $\mathbf{u} \times \mathbf{v}$ form a right-handed system (Fig. 2.2).
Obviously, this operation is useful for computing the surface normal vector of a planar
element defined by two vectors $\mathbf{u}$ and $\mathbf{v}$. The magnitude of
$\mathbf{u} \times \mathbf{v}$ (denoted by $|\mathbf{u} \times \mathbf{v}|$) gives twice the
area of the triangle formed by the two vectors (Figs. 2.2a and 2.3). For unit vectors,
$|\mathbf{u} \times \mathbf{v}|$ is also equal to the sine of the angle between the two
vectors (Box 2.2).
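The normalization, dot-product and cross-product operations described above could be coded along the following lines. The `Vec3` type and the function names are illustrative assumptions, and the zero fourth component of a homogeneous vector is simply omitted.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3-component vector type (the homogeneous w = 0 is implied).
struct Vec3 {
    double x, y, z;
};

// Dot product: x_u x_v + y_u y_v + z_u z_v.
double dot(const Vec3& u, const Vec3& v) {
    return u.x * v.x + u.y * v.y + u.z * v.z;
}

// Cross product, following the component formula given in the text.
Vec3 cross(const Vec3& u, const Vec3& v) {
    return {u.y * v.z - v.y * u.z,
            u.z * v.x - v.z * u.x,
            u.x * v.y - v.x * u.y};
}

// Magnitude of a vector (Eq. 2.1).
double length(const Vec3& v) {
    return std::sqrt(dot(v, v));
}

// Returns the unit vector along v; v must be non-zero.
Vec3 normalize(const Vec3& v) {
    double d = length(v);
    assert(d > 0.0);
    return {v.x / d, v.y / d, v.z / d};
}
```

For the unit vectors obtained from (2, 0, 0) and (0, 3, 0), the dot product is 0 (the cosine of 90 degrees) and the cross product is (0, 0, 1), whose magnitude 1 is the sine of the same angle.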
Box 2.2 Vector Products


The following facts are commonly used in computations involving vectors: 


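If $\mathbf{u}$ is a unit vector, then $\mathbf{u} \cdot \mathbf{u} = 1$.
If $\mathbf{u}$ is perpendicular to $\mathbf{v}$, then $\mathbf{u} \cdot \mathbf{v} = 0$.
If $\mathbf{u}$ is parallel to $\mathbf{v}$, then $\mathbf{u} \times \mathbf{v} = \mathbf{0}$. In particular, $\mathbf{u} \times \mathbf{u} = \mathbf{0}$.
The magnitude of $\mathbf{u} \times \mathbf{v}$ is the area of the parallelogram formed by $\mathbf{u}$, $\mathbf{v}$.
The scalar triple product $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$ gives the volume of the parallelepiped
formed by the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$. The value does not change with a cyclic
permutation of the vectors: $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \mathbf{v} \cdot (\mathbf{w} \times \mathbf{u}) = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})$.
$\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$ can be written as the determinant
$$ \begin{vmatrix} x_u & y_u & z_u \\ x_v & y_v & z_v \\ x_w & y_w & z_w \end{vmatrix} $$
The vector triple product $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is the same as $(\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}$.
The magnitudes of the dot and cross products of two vectors $\mathbf{u}$ and $\mathbf{v}$ are
related by the equation: $|\mathbf{u} \times \mathbf{v}|^2 = |\mathbf{u}|^2 |\mathbf{v}|^2 - (\mathbf{u} \cdot \mathbf{v})^2$.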


We saw in the previous paragraph that both the dot and the cross products of
two unit vectors can give us the information about the angle between them in the
form of trigonometric functions cos() and sin() respectively. Note that the
function acos($\mathbf{u} \cdot \mathbf{v}$) returns the angle in the range $[0, \pi]$ only.
Neither can we use asin($|\mathbf{u} \times \mathbf{v}|$) to determine the angle correctly,
because the resulting value will always be in the restricted range $[0, \pi/2]$ (even though
asin() returns a value in the range $[-\pi/2, \pi/2]$, since $|\mathbf{u} \times \mathbf{v}|$
is never negative, neither is the result). We will explore ways to compute the true angle
in the range $[-\pi, \pi]$ in Sect. 2.2.

If we represent the vertices of a triangle by points $A = (x_a, y_a, z_a)$,
$B = (x_b, y_b, z_b)$, $C = (x_c, y_c, z_c)$, the surface normal vector and the area of the
triangle can be obtained from the cross product of two vectors $\mathbf{u}$, $\mathbf{v}$
constructed as shown in Fig. 2.3.

The normal vector $\mathbf{n}$ of the triangle in Fig. 2.3 has components $(x_n, y_n, z_n)$
given by

$$ \begin{aligned}
x_n &= y_a(z_b - z_c) + y_b(z_c - z_a) + y_c(z_a - z_b) \\
y_n &= z_a(x_b - x_c) + z_b(x_c - x_a) + z_c(x_a - x_b) \\
z_n &= x_a(y_b - y_c) + x_b(y_c - y_a) + x_c(y_a - y_b)
\end{aligned} \tag{2.2} $$

The above vector is the same as $\mathbf{u} \times \mathbf{v}$. The area of the triangle ABC
can be computed from the above components of the normal vector as follows:

$$ \Delta ABC = \frac{1}{2}\sqrt{x_n^2 + y_n^2 + z_n^2} = \frac{1}{2}|\mathbf{u} \times \mathbf{v}| \tag{2.3} $$
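The component form of the triangle normal (Eq. 2.2) and the area formula (Eq. 2.3) translate directly into code. The sketch below uses a hypothetical `Vec3` type and function names chosen for illustration only.

```cpp
#include <cmath>

// Hypothetical 3-component type used for both points and vectors.
struct Vec3 {
    double x, y, z;
};

// Un-normalised normal of triangle ABC, written out term by term as in Eq. 2.2.
Vec3 triangleNormal(const Vec3& a, const Vec3& b, const Vec3& c) {
    return {a.y * (b.z - c.z) + b.y * (c.z - a.z) + c.y * (a.z - b.z),
            a.z * (b.x - c.x) + b.z * (c.x - a.x) + c.z * (a.x - b.x),
            a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y)};
}

// Unsigned area of triangle ABC: half the magnitude of the normal (Eq. 2.3).
double triangleArea(const Vec3& a, const Vec3& b, const Vec3& c) {
    Vec3 n = triangleNormal(a, b, c);
    return 0.5 * std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
}
```

For A = (0, 0, 0), B = (2, 0, 0), C = (0, 2, 0) the normal is (0, 0, 4) and the area is 2, as expected for a right triangle with two sides of length 2.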
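Let us turn our attention to another important vector operation called projection.
A vector $\mathbf{s}$ can be projected onto a unit vector $\mathbf{n}$, with the projected
vector given by $(\mathbf{s} \cdot \mathbf{n})\mathbf{n}$ (see Fig. 2.2b). This also implies
that the length of the projection of $\mathbf{s}$ on a unit vector $\mathbf{n}$ is
$\mathbf{s} \cdot \mathbf{n}$. We can use this fact to express any vector $\mathbf{s}$ in
terms of its projections along three mutually orthogonal unit vectors $\mathbf{u}$,
$\mathbf{v}$, and $\mathbf{w}$ as

$$ \mathbf{s} = (\mathbf{s} \cdot \mathbf{u})\mathbf{u} + (\mathbf{s} \cdot \mathbf{v})\mathbf{v} + (\mathbf{s} \cdot \mathbf{w})\mathbf{w} \tag{2.4} $$

If $\mathbf{s}$ is also a unit vector, then the terms $\mathbf{s} \cdot \mathbf{u}$,
$\mathbf{s} \cdot \mathbf{v}$, $\mathbf{s} \cdot \mathbf{w}$ are called the direction
cosines of the vector in the coordinate space spanned by the unit vectors $\mathbf{u}$,
$\mathbf{v}$, and $\mathbf{w}$. In a new coordinate space defined by $\mathbf{u}$,
$\mathbf{v}$, and $\mathbf{w}$, the components of any vector $\mathbf{s}$ are therefore given
by $(\mathbf{s} \cdot \mathbf{u}, \mathbf{s} \cdot \mathbf{v}, \mathbf{s} \cdot \mathbf{w})$.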

The reflection of the vector s with respect to a unit vector n is the vector r that lies 
on the plane containing s and n as shown in Fig. 2.2c, such that the angle between r 
and n is the same as the angle between s and n. The reflection vector is commonly 
used in lighting calculations and ray tracing, where s stands for the vector towards a 
light source, and n is the surface normal vector. The vector components of r can be 
computed using the formula 


$$ \mathbf{r} = 2(\mathbf{s} \cdot \mathbf{n})\mathbf{n} - \mathbf{s} \tag{2.5} $$
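Projection onto a unit vector and the reflection formula of Eq. 2.5 can be sketched as below. The names `Vec3`, `project` and `reflect` are illustrative assumptions, and the tolerance used to check that n is a unit vector is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3-component vector type.
struct Vec3 {
    double x, y, z;
};

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Projection of s onto the unit vector n: (s . n) n.
Vec3 project(const Vec3& s, const Vec3& n) {
    assert(std::fabs(dot(n, n) - 1.0) < 1e-9);  // n is assumed to be a unit vector
    double k = dot(s, n);
    return {k * n.x, k * n.y, k * n.z};
}

// Reflection of s with respect to the unit vector n: r = 2 (s . n) n - s (Eq. 2.5).
Vec3 reflect(const Vec3& s, const Vec3& n) {
    Vec3 p = project(s, n);
    return {2.0 * p.x - s.x, 2.0 * p.y - s.y, 2.0 * p.z - s.z};
}
```

With s = (1, 1, 0) and n = (0, 1, 0), the projection is (0, 1, 0) and the reflection is (-1, 1, 0), which makes the same angle with n as s does.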


2.2 Signed Angle and Area 


In the previous section, we noted that the computation of the angle between two
vectors using acos() or asin() functions always yielded only positive values
in the range $[0, \pi]$. One may suggest using the function
atan2($|\mathbf{u} \times \mathbf{v}|$, $\mathbf{u} \cdot \mathbf{v}$). This
form of computation of angle has the advantage that neither $\mathbf{u}$ nor $\mathbf{v}$
needs to be normalized. However, this function also returns values in the positive range
$[0, \pi]$ only, because the numerator $|\mathbf{u} \times \mathbf{v}|$ is always positive.
The difference between the positive and negative sense of angle is completely view
dependent. For vectors residing on the two-dimensional xy-plane, the direction to the
viewer is always implied to be the $+z$ direction. In a general three-dimensional case, we
need to specify this view direction in order to determine the signed angle in the range
$[-\pi, \pi]$ between two given vectors.

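If we denote the view direction by $\mathbf{w}$ (Fig. 2.4), the angle measured from
$\mathbf{u}$ to $\mathbf{v}$ is positive if the sense of rotation from $\mathbf{u}$ to
$\mathbf{v}$ is anticlockwise when viewed from $\mathbf{w}$. In other words, if $\mathbf{w}$
is in the same direction as $\mathbf{u} \times \mathbf{v}$, then the angle is
positive, otherwise negative. We can now define the signed angle between $\mathbf{u}$ and
$\mathbf{v}$ with respect to the view vector $\mathbf{w}$ as

$$ \theta = \mathrm{sign}((\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w})\,\cos^{-1}\!\left(\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|}\right) \tag{2.6} $$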
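A possible implementation of the view-dependent signed angle of Eq. 2.6 is sketched below. The type and function names are hypothetical. Two details are assumptions not stated in the text: the cosine is clamped to [-1, 1] to guard against rounding error, and a zero value of (u × v)·w is treated as a positive sign.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical 3-component vector type.
struct Vec3 {
    double x, y, z;
};

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& u, const Vec3& v) {
    return {u.y * v.z - v.y * u.z,
            u.z * v.x - v.z * u.x,
            u.x * v.y - v.x * u.y};
}

// Signed angle from u to v as seen from the view direction w (Eq. 2.6).
// The vectors u and v need not be normalised, but must be non-zero.
double signedAngle(const Vec3& u, const Vec3& v, const Vec3& w) {
    double lu = std::sqrt(dot(u, u));
    double lv = std::sqrt(dot(v, v));
    assert(lu > 0.0 && lv > 0.0);
    double c = dot(u, v) / (lu * lv);
    c = std::max(-1.0, std::min(1.0, c));        // assumed guard against rounding
    double side = dot(cross(u, v), w);
    double sign = (side < 0.0) ? -1.0 : 1.0;     // assumption: zero counts as positive
    return sign * std::acos(c);
}
```

Rotating from the x-axis to the y-axis gives +π/2 when viewed from the +z direction and -π/2 when viewed from the -z direction.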
[Fig. 2.4 annotations: "For this view direction, both angle and area are negative." and "For this view direction, both angle and area are positive."]

Fig. 2.4 The angle between two vectors and the area of the triangle formed by the vectors can
have either a positive or a negative sign depending on the orientation of the vertices with
respect to a given direction


If $\mathbf{u}$ and $\mathbf{v}$ are two-dimensional vectors on the xy-plane, we can have the
following simplified form for the signed angle:

$$ \theta = \operatorname{atan2}(x_u y_v - x_v y_u,\; x_u x_v + y_u y_v) \tag{2.7} $$

We can also define a view-dependent sign for the area of a triangle based on the
above concept. If the view vector $\mathbf{w}$ has components $(x_w, y_w, z_w, 0)$, Eq. 2.3
now gets modified as follows:


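$$ \Delta ABC = \frac{1}{2}\,\mathrm{sign}(x_n x_w + y_n y_w + z_n z_w)\sqrt{x_n^2 + y_n^2 + z_n^2} = \frac{1}{2}\,\mathrm{sign}(\mathbf{n} \cdot \mathbf{w})\,|\mathbf{u} \times \mathbf{v}| \tag{2.8} $$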


where $x_n$, $y_n$, $z_n$ are computed from the vertex coordinates using Eq. 2.2.

For a triangle on the xy-plane, the right-hand side of the above equation reduces
to $z_n/2$. Thus the signed area of a triangle with vertices $A = (x_a, y_a)$,
$B = (x_b, y_b)$, $C = (x_c, y_c)$ is

$$ \Delta ABC = \frac{1}{2}\left(x_a(y_b - y_c) + x_b(y_c - y_a) + x_c(y_a - y_b)\right) \tag{2.9} $$


The signed area is positive only if the vertices A, B, C are oriented in an
anticlockwise sense with respect to the view direction. The signed area of a triangle
is useful in determining if a point is inside the triangle or not. This method is
discussed in detail in Sect. 2.8. The concepts presented above are also used for
defining the orientation of three points. Three points A, B, C are said to be oriented
in the anticlockwise sense with respect to a direction $\mathbf{w}$ if

$$ ((B - A) \times (C - A)) \cdot \mathbf{w} > 0. \tag{2.10} $$


If the above condition is satisfied, the three points are said to make a left turn
when viewed from the direction $\mathbf{w}$. With reference to Fig. 2.4, the equivalent
condition in vector notation is $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} > 0$. On the
xy-plane, the three points make a left turn if

$$ x_a(y_b - y_c) + x_b(y_c - y_a) + x_c(y_a - y_b) > 0. \tag{2.11} $$


The reversal of the inequality implies a right turn. The points are collinear if 
the above expression yields 0. In the next section we will use vector notations and 
related operations to get concise forms of line and plane equations. 
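The two-dimensional signed area (Eq. 2.9) and the left-turn test (Eq. 2.11) share the same expression, so one can be built on the other. The sketch below uses hypothetical names (`Point2`, `Turn`, `signedArea`, `orientation`). Comparing the area exactly with zero is a simplification; a practical implementation would probably use a tolerance.

```cpp
// Hypothetical 2D point type.
struct Point2 {
    double x, y;
};

// Signed area of triangle ABC on the xy-plane (Eq. 2.9).
// Positive when A, B, C are in anticlockwise order as seen from +z.
double signedArea(const Point2& a, const Point2& b, const Point2& c) {
    return 0.5 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
}

enum class Turn { Left, Right, Collinear };

// Classifies the turn made by the path A -> B -> C (Eq. 2.11).
Turn orientation(const Point2& a, const Point2& b, const Point2& c) {
    double area = signedArea(a, b, c);
    if (area > 0.0) return Turn::Left;
    if (area < 0.0) return Turn::Right;
    return Turn::Collinear;  // exact comparison: an assumed simplification
}
```

For A = (0, 0), B = (1, 0), C = (0, 1) the signed area is 0.5 and the points make a left turn; swapping B and C flips the sign.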


2.3 Lines and Planes 


Lines and planes form integral parts of three-dimensional models and virtual worlds. 
A good understanding of line and plane equations and their analytical properties is 
essential for the development of many applications. For example, even a simple ray 
tracing application requires the computation of several line-plane intersections. 

A straight line segment can be defined using two points, say $P = (x_p, y_p, z_p, 1)$
and $Q = (x_q, y_q, z_q, 1)$. The equation of this line in terms of a single parameter t can
be expressed as

$$ x = x_p + t(x_q - x_p); \quad y = y_p + t(y_q - y_p); \quad z = z_p + t(z_q - z_p) \tag{2.12} $$


For any value of t between 0 and 1, the above set of equations gives the
coordinates of a point on the straight line that lies between P and Q. We can also
write the equation of this line segment using vector notation as follows:

$$ \mathbf{r} = \mathbf{p} + t\mathbf{m}, \quad 0 \le t \le 1. \tag{2.13} $$

where $\mathbf{r} = (x, y, z, 1)$, $\mathbf{p} = (x_p, y_p, z_p, 1)$ and
$\mathbf{m} = Q - P$. The above equation can also be used to represent a ray starting from
the point $\mathbf{p}$ and having a direction given by the vector $\mathbf{m}$. In this
representation, $\mathbf{m}$ is generally a unit vector and t can have any
positive value. The line given in Eq. 2.12 can be rewritten in the standard form by
eliminating t:

$$ \frac{x - x_p}{x_q - x_p} = \frac{y - y_p}{y_q - y_p} = \frac{z - z_p}{z_q - z_p} \tag{2.14} $$
Fig. 2.5 Computation of shortest distances of a point V from (a) a line PQ and (b) a
plane PQR


From the above equation, we immediately get the condition for the collinearity
of three points $P = (x_p, y_p, z_p, 1)$, $Q = (x_q, y_q, z_q, 1)$ and
$R = (x_r, y_r, z_r, 1)$:

$$ \frac{x_r - x_p}{x_q - x_p} = \frac{y_r - y_p}{y_q - y_p} = \frac{z_r - z_p}{z_q - z_p} \tag{2.15} $$


Using Eq. 2.12, we can determine the point S on the line PQ that lies closest to
a general three-dimensional point $V = (x_v, y_v, z_v, 1)$. The shortest distance of the
point V from the line is given by VS (Fig. 2.5), where S is the projection of the point
V on PQ. The point S satisfies the condition that the line segments PQ and VS are
orthogonal to each other. Using this condition, the parametric value t of the point S
can be obtained as follows:

$$ t = \frac{(x_v - x_p)(x_q - x_p) + (y_v - y_p)(y_q - y_p) + (z_v - z_p)(z_q - z_p)}{(x_q - x_p)^2 + (y_q - y_p)^2 + (z_q - z_p)^2} \tag{2.16} $$

Substitution of the above value in Eq. 2.12 gives the coordinates of the point S.
The shortest (or the perpendicular) distance D of the point V from the line PQ is
obtained as the distance $|V - S|$.
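The closest-point computation of Eq. 2.16 amounts to one dot product divided by a squared length. The sketch below assumes hypothetical names (`Vec3`, `closestParam`, `closestPointOnLine`, `distanceToLine`) and treats PQ as an infinite line, so t is not clamped to the range 0 to 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3-component type used for both points and vectors.
struct Vec3 {
    double x, y, z;
};

Vec3 sub(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Parameter t of the point S on line PQ closest to V (Eq. 2.16).
double closestParam(const Vec3& p, const Vec3& q, const Vec3& v) {
    Vec3 m = sub(q, p);
    double len2 = dot(m, m);
    assert(len2 > 0.0);  // P and Q must be distinct points
    return dot(sub(v, p), m) / len2;
}

// The point S itself, obtained by substituting t into Eq. 2.12.
Vec3 closestPointOnLine(const Vec3& p, const Vec3& q, const Vec3& v) {
    double t = closestParam(p, q, v);
    return {p.x + t * (q.x - p.x), p.y + t * (q.y - p.y), p.z + t * (q.z - p.z)};
}

// Perpendicular distance |V - S| of V from the line PQ.
double distanceToLine(const Vec3& p, const Vec3& q, const Vec3& v) {
    Vec3 d = sub(v, closestPointOnLine(p, q, v));
    return std::sqrt(dot(d, d));
}
```

For P = (0, 0, 0), Q = (2, 0, 0) and V = (1, 3, 0), the parameter is t = 0.5, the closest point is S = (1, 0, 0), and the distance is 3.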

A plane in three-dimensional space is uniquely defined by three non-collinear 
points, or equivalently, by a point P that lies on the plane and its surface normal 
vector n. The equation of the plane in terms of the coordinates of the three points 
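$P = (x_p, y_p, z_p, 1)$, $Q = (x_q, y_q, z_q, 1)$, $R = (x_r, y_r, z_r, 1)$, is given by
the determinant

$$ \begin{vmatrix} x & y & z & 1 \\ x_p & y_p & z_p & 1 \\ x_q & y_q & z_q & 1 \\ x_r & y_r & z_r & 1 \end{vmatrix} = 0 \tag{2.17} $$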


From this equation of the plane, we get the condition for the coplanarity of four 
points P, Q, R, S: 


$$ \begin{vmatrix} x_p & y_p & z_p & 1 \\ x_q & y_q & z_q & 1 \\ x_r & y_r & z_r & 1 \\ x_s & y_s & z_s & 1 \end{vmatrix} = 0. \tag{2.18} $$
The determinant is equivalent to
$(P - Q) \cdot (\mathbf{r} \times \mathbf{s}) + (R - S) \cdot (\mathbf{p} \times \mathbf{q})$.
The condition in Eq. 2.18 also points to the fact that the vectors $(Q - P)$ and $(R - S)$
are coplanar. Thus we can rewrite the above equation using the following scalar triple
product:

$$ (R - P) \cdot ((Q - P) \times (S - R)) = 0. \tag{2.19} $$

The surface normal vector $\mathbf{n}$ for the above plane can be obtained (similar to
Eq. 2.2), by taking the cross-product of vectors $Q - P$ and $R - P$. The components
of $\mathbf{n}$ written as a column vector are given below:


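$$ \begin{bmatrix} x_n \\ y_n \\ z_n \\ 0 \end{bmatrix} = \begin{bmatrix} (y_q - y_p)(z_r - z_p) - (y_r - y_p)(z_q - z_p) \\ (z_q - z_p)(x_r - x_p) - (z_r - z_p)(x_q - x_p) \\ (x_q - x_p)(y_r - y_p) - (x_r - x_p)(y_q - y_p) \\ 0 \end{bmatrix} \tag{2.20} $$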


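The plane equation can be written in point-normal form as

$$ (x - x_p)x_n + (y - y_p)y_n + (z - z_p)z_n = 0 \tag{2.21} $$

which can always be simplified into a linear equation $ax + by + cz + d = 0$, or
expressed using vector notation as

$$ (\mathbf{r} - \mathbf{p}) \cdot \mathbf{n} = 0, \quad \text{or equivalently,} \quad \mathbf{r} \cdot \mathbf{n} = -d, \tag{2.22} $$

where $d = -\mathbf{p} \cdot \mathbf{n}$. The point of intersection of this plane and a ray
can be obtained by substituting the equation of the ray,
$\mathbf{r} = \mathbf{q} + t\mathbf{m}$, in the above equation and solving for t:

$$ t = \frac{-(\mathbf{q} \cdot \mathbf{n}) - d}{\mathbf{m} \cdot \mathbf{n}} \tag{2.23} $$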
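Solving for the ray parameter t as in Eq. 2.23 leads to a short ray-plane intersection routine. The sketch below is illustrative only: `Vec3` and `intersectRayPlane` are hypothetical names, the tolerance `eps` is an arbitrary assumed value, and hits with negative t (behind the ray origin) are rejected.

```cpp
#include <cmath>

// Hypothetical 3-component type used for both points and vectors.
struct Vec3 {
    double x, y, z;
};

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Intersects the ray r = q + t m (t >= 0) with the plane r . n = -d.
// Returns false if the ray is parallel to the plane or points away from it;
// otherwise stores the intersection point in 'hit' and returns true.
bool intersectRayPlane(const Vec3& q, const Vec3& m,
                       const Vec3& n, double d, Vec3& hit) {
    const double eps = 1e-12;              // assumed tolerance for "parallel"
    double denom = dot(m, n);
    if (std::fabs(denom) < eps) return false;
    double t = (-dot(q, n) - d) / denom;   // Eq. 2.23
    if (t < 0.0) return false;             // intersection lies behind the ray origin
    hit = {q.x + t * m.x, q.y + t * m.y, q.z + t * m.z};
    return true;
}
```

For the plane z = 2 (n = (0, 0, 1), d = -2), a ray from the origin along +z hits the plane at (0, 0, 2), while a ray along +x is parallel to the plane and is rejected.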


The denominator in the above equation becomes zero when the line is orthogonal
to $\mathbf{n}$, i.e., parallel to the plane. The shortest distance D of the point
$\mathbf{v}$ from the plane (see Fig. 2.5b) is given by the equation

$$ D = \frac{(x_v - x_p)x_n + (y_v - y_p)y_n + (z_v - z_p)z_n}{\sqrt{x_n^2 + y_n^2 + z_n^2}} = \frac{(\mathbf{v} \cdot \mathbf{n}) + d}{|\mathbf{n}|} \tag{2.24} $$

The above term is also called the signed distance of the point $\mathbf{v}$ from the
plane, as it assumes a positive value if $\mathbf{v}$ is on the same side as $\mathbf{n}$,
and a negative value otherwise. In general, if the plane's equation is given in the normal
form $ax + by + cz + d = 0$, where $a^2 + b^2 + c^2 = 1$, the signed distance of the point
$\mathbf{v} = (x_v, y_v, z_v)$ is given by

$$ D = a x_v + b y_v + c z_v + d \tag{2.25} $$
Fig. 2.6 Two-parameter representation of a plane, r = P + su + tv


The above expression can be thought of as the dot product between the vector
$(a, b, c, d)$ and $(x_v, y_v, z_v, 1)$, which is the homogeneous representation of
$\mathbf{v}$. Note that the unit normal vector to the plane is given by $(a, b, c)$. Signed
distances are
extensively used in collision detection and point inclusion tests using bounding 
volumes. 
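The signed distance of Eq. 2.25 can be evaluated as a four-component dot product once the plane coefficients are normalised. The following sketch assumes a hypothetical `Plane` type holding (a, b, c, d) and illustrative function names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical plane type: coefficients of ax + by + cz + d = 0.
struct Plane {
    double a, b, c, d;
};

// Scales the coefficients so that a^2 + b^2 + c^2 = 1 (normal form).
Plane normalised(const Plane& p) {
    double len = std::sqrt(p.a * p.a + p.b * p.b + p.c * p.c);
    assert(len > 0.0);  // (a, b, c) must be a non-zero normal
    return {p.a / len, p.b / len, p.c / len, p.d / len};
}

// Signed distance of the point (x, y, z) from a plane in normal form (Eq. 2.25),
// i.e. the dot product of (a, b, c, d) with the homogeneous point (x, y, z, 1).
double signedDistance(const Plane& p, double x, double y, double z) {
    return p.a * x + p.b * y + p.c * z + p.d * 1.0;
}
```

The plane 2z - 4 = 0 normalises to (0, 0, 1, -2); the point (5, 5, 5) is then at distance +3 on the side of the normal, and the origin is at distance -2 on the opposite side.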

Given three non-collinear points P, Q, R, we can have a parametric representation 
of the plane through the points as 


$$ \mathbf{r} = P + s(Q - P) + t(R - P) = P + s\mathbf{u} + t\mathbf{v} \tag{2.26} $$

where $\mathbf{u}$ and $\mathbf{v}$ are vectors along two sides of the triangle PQR
(Fig. 2.6). An alternate form for the above equation that expresses any point on the plane
as a linear combination of the vertices of the triangle is

$$ \mathbf{r} = P(1 - s - t) + sQ + tR \tag{2.27} $$

For every point $\mathbf{r}(s, t)$ inside the triangle, the following properties hold:

$$ 0 \le s \le 1, \quad 0 \le t \le 1, \quad 0 \le s + t \le 1. \tag{2.28} $$

In addition to the above conditions, points along the edge PQ satisfy the
parametric equation $t = 0$. Similarly, the edge PR is characterized by the equation
$s = 0$, and RQ by the property $s + t = 1$.
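The two-parameter form of a plane and the conditions for a point to lie within the triangle can be illustrated as follows. The names `Vec3`, `pointOnPlane` and `insideTriangle` are hypothetical, and points on the edges are counted as inside, which is an assumption about how the boundary is treated.

```cpp
// Hypothetical 3-component point type.
struct Vec3 {
    double x, y, z;
};

// Point on the plane through P, Q, R for parameters (s, t):
// r = P(1 - s - t) + sQ + tR.
Vec3 pointOnPlane(const Vec3& P, const Vec3& Q, const Vec3& R, double s, double t) {
    double k = 1.0 - s - t;
    return {k * P.x + s * Q.x + t * R.x,
            k * P.y + s * Q.y + t * R.y,
            k * P.z + s * Q.z + t * R.z};
}

// True if the parameters (s, t) describe a point of triangle PQR,
// with points on the edges counted as inside (assumed convention).
bool insideTriangle(double s, double t) {
    return s >= 0.0 && s <= 1.0 &&
           t >= 0.0 && t <= 1.0 &&
           s + t <= 1.0;
}
```

With P = (0, 0, 0), Q = (4, 0, 0), R = (0, 4, 0), the parameters s = t = 0.25 give the interior point (1, 1, 0), whereas s = 0.75, t = 0.5 lies on the plane but outside the triangle because s + t exceeds 1.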


2.4 Intersection of 3 Planes 


An interesting problem commonly encountered while working with planes is the 
computation of the point of intersection (if it exists) where three planes meet. 
Even if it is guaranteed that no two planes are parallel, there can be three different 
configurations in which three planes can meet (Fig. 2.7).

Fig. 2.7 Three different configurations in which three non-parallel planes can meet


In the first configuration in Fig. 2.7, the lines of intersection formed by taking
two planes at a time coincide, so that we get a single line of intersection.
In the second configuration, the lines of intersection are parallel even though the
planes are not. It can be easily proven that if two lines of intersection are parallel,
then the third is also parallel to the other two. This situation arises when the three
surface normal vectors of the planes are all coplanar. In the third configuration, the
non-parallel lines of intersection meet at a single point.

Let the three planes be given by the equations (see Eq. 2.22) $r \cdot n_i = -d_i$ (i = 1, 2,
3), where the $n_i$ are unit normal vectors. The directions of the three lines of intersection
are then specified by the cross products $n_1 \times n_2$, $n_2 \times n_3$, and $n_3 \times n_1$. The point
of intersection, if it exists, can be expressed as a linear combination of these three
vectors (Goldman 1990):


$$p = a(n_1 \times n_2) + b(n_2 \times n_3) + c(n_3 \times n_1) \tag{2.29}$$

The above point lies on all three planes. Substitution in the plane equations gives

$$b\,(n_1 \cdot (n_2 \times n_3)) = -d_1$$

$$c\,(n_2 \cdot (n_3 \times n_1)) = -d_2$$

$$a\,(n_3 \cdot (n_1 \times n_2)) = -d_3 \tag{2.30}$$


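The scalar triple products on the left side of the above equations are all equal (see
Box 2.2). Substituting the resulting values of a, b and c, Equation 2.29 can now be
written as


$$p = \frac{-d_1 (n_2 \times n_3) - d_2 (n_3 \times n_1) - d_3 (n_1 \times n_2)}{n_1 \cdot (n_2 \times n_3)} \tag{2.31}$$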


For the first two configurations shown in Fig. 2.7, the vectors $n_1$, $n_2$, $n_3$ are
coplanar, and the denominator of the above equation becomes zero. For the third
configuration, the equation returns a valid point.
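The computation of the intersection point can be sketched as follows. The function name `intersect_three_planes` and the tolerance `eps` are assumptions made for this illustration. The point is obtained by solving Eq. 2.30 for a, b, c and substituting in Eq. 2.29, with the planes written as $r \cdot n_i = -d_i$.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def intersect_three_planes(n1, d1, n2, d2, n3, d3, eps=1e-12):
    """Return the common point of the planes r . n_i = -d_i, or None when the
    normals are coplanar (the first two configurations of Fig. 2.7)."""
    denom = dot(n1, cross(n2, n3))
    if abs(denom) < eps:
        return None
    c23, c31, c12 = cross(n2, n3), cross(n3, n1), cross(n1, n2)
    return tuple((-d1 * c23[i] - d2 * c31[i] - d3 * c12[i]) / denom
                 for i in range(3))


# The planes x = 1, y = 2 and z = 3 meet at the point (1, 2, 3).
corner = intersect_three_planes((1, 0, 0), -1, (0, 1, 0), -2, (0, 0, 1), -3)
# Three planes whose normals all lie in the xy-plane have no single common point.
degenerate = intersect_three_planes((1, 0, 0), -1, (0, 1, 0), -2, (1, 1, 0), 0)
```

Testing the scalar triple product against a small tolerance, rather than against exact zero, is a design choice of this sketch that guards against nearly coplanar normals.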
In Sect. 2.3, we came across the equation of a straight line expressed in terms of
linear polynomials of a single parameter t (Eq. 2.12). Polynomials of a higher degree
in t can be used to define curves in three-dimensional space. In the most general
form, a curve can be represented as P(t) = (x(t), y(t), z(t)), where x(t), y(t), z(t) are
continuous and differentiable functions of the parameter t. Polynomials of degree
n have the property that their derivatives up to order n−1 exist and are continuous
over any finite interval in the parameter space. We can use the derivatives of the
functions to define the tangential and normal directions to the curve at any point,
and also to construct an orthonormal basis at any point on the curve.

The tangent vector at P(t) is given by the first derivative with respect to t, i.e.,
P'(t) = (x'(t), y'(t), z'(t)). The unit tangent vector is denoted as


$$T(t) = \frac{P'(t)}{|P'(t)|} \tag{2.32}$$


The tangent vector represents the local orientation of the curve at a point. If
the parameter t denotes time, then P'(t) represents the instantaneous velocity of
the moving point P(t). The distance travelled from a starting point $A = P(t_0)$ to the
current point, or in other words the arc length measured from A, is given by


$$s(t) = \int_{t_0}^{t} |P'(u)|\, du = \int_{t_0}^{t} \sqrt{(x'(u))^2 + (y'(u))^2 + (z'(u))^2}\; du \tag{2.33}$$


Using the above equation we can express t as a function of arc length s, and
re-parameterize the curve as P(s) = (x(s), y(s), z(s)). The chain rule for
differentiation gives


$$P'(t) = P'(s)\, s'(t) = P'(s)\, |P'(t)| \tag{2.34}$$


from which we find that P'(s) is equivalent to the unit tangent vector T(t). For
convenience, we denote P'(s) by T(s). Since $T(s) \cdot T(s) = 1$, it immediately follows
that $T(s) \cdot T'(s) = 0$. Thus the instantaneous rate of change of the tangent direction
is parallel to the normal vector at that point. If the unit normal direction at P(s) is
denoted as N(s), we have


$$T'(s) = \frac{d(T(s))}{ds} = \kappa(s) N(s) \tag{2.35}$$

The proportionality factor κ(s) is called the curvature of the curve at P(s). The
curvature is a measure of the deviation of the curve from a straight line. For a straight
line, κ(s) = 0 at all points. The magnitude of the curvature is easily obtained as
|κ(s)| = |T'(s)|, and the unit normal direction at P(s) is given by


$$N(s) = \frac{P''(s)}{|P''(s)|} = \frac{P'(t) \times (P''(t) \times P'(t))}{|P'(t)|\, |P''(t) \times P'(t)|} \tag{2.36}$$


Fig. 2.8 Frenet frame attached to a curve at the point P


The plane containing the tangent vector and the normal vector is known as the
osculating plane. The cross-product of the two unit vectors T(s) and N(s) gives the
direction of the unit bi-normal vector denoted by B(s):


$$B(s) = T(s) \times N(s) = \frac{P'(s) \times P''(s)}{|P'(s) \times P''(s)|} = \frac{P'(t) \times P''(t)}{|P'(t) \times P''(t)|} \tag{2.37}$$


The three unit vectors T, N, B form an orthonormal basis as shown in Fig. 2.8.
This local reference system is called the Frenet frame. The derivative of the
bi-normal vector B'(s) is perpendicular to both B(s) and T(s), and hence parallel to
N(s):


$$B'(s) = \frac{d(B(s))}{ds} = -\tau(s) N(s) \tag{2.38}$$

The term τ(s) is called the torsion of the curve at s. Torsion is a measure of how
much the curve deviates from the osculating plane.

The plane containing the tangent and binormal vectors is called the rectifying 
plane (Fig. 2.8). The plane formed by the normal and binormal vectors is called the 
normal plane. 

The Frenet frame is useful for defining the local orientation of objects that move 
along a curved path. It can also be used for defining the eye-coordinate system for a 
camera that undergoes a curvilinear motion. 
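A Frenet frame can be computed directly from the first two derivatives of a curve. The sketch below is illustrative only: the helper names are hypothetical, a circular helix is assumed as the test curve, and N is obtained as B × T (which follows from T × N = B) instead of evaluating Eq. 2.36. The construction is undefined wherever P'(t) × P''(t) vanishes, for example along a straight line.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(dot(u, u))
    return tuple(c / length for c in u)


def frenet_frame(d1, d2):
    """Return (T, N, B) from the derivatives d1 = P'(t) and d2 = P''(t)."""
    T = normalize(d1)              # unit tangent, Eq. 2.32
    B = normalize(cross(d1, d2))   # unit bi-normal, Eq. 2.37
    N = cross(B, T)                # unit normal, since T x N = B
    return T, N, B


def helix_derivatives(t):
    """First and second derivatives of the helix P(t) = (cos t, sin t, t)."""
    return ((-math.sin(t), math.cos(t), 1.0),
            (-math.cos(t), -math.sin(t), 0.0))


T, N, B = frenet_frame(*helix_derivatives(0.0))
```

For the helix, the normal N points from the curve towards the axis of the helix, which gives a quick sanity check of the frame's orientation.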


2.6 Affine Transformations 


In this section, we consider linear transformations of three-dimensional points and
vectors. The homogeneous coordinate system (Sect. 2.1) allows all transformations
including translations to be represented using 4 × 4 matrices. We denote a translation
by a vector $v = (x_v, y_v, z_v)$ by $T_v$, a rotation about the x-axis by an angle θ by
$R_\theta(x)$, and a scaling by a vector $k = (x_k, y_k, z_k)$ by $S_k$ (Box 2.3).


Fig. 2.9 Examples showing transformations of (a) a translation by an offset vector v (b) a rotation
about the x-axis by an angle θ and (c) scaling by factors $k_x$, $k_y$, $k_z$

A linear transformation followed by a translation is called an affine transform. A
general transformation can be given in matrix form as follows:


$$\begin{bmatrix} x'_p \\ y'_p \\ z'_p \\ 1 \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_p \\ y_p \\ z_p \\ 1 \end{bmatrix} \tag{2.39}$$


In the above equation, the matrix elements $a_{ij}$ are all constants. $(a_{03}, a_{13},
a_{23})$ denote the translation components, and $(x_p, y_p, z_p, 1)$ the point on which the
transformation is applied. The translation parameters do not have any effect on
a vector $(x_v, y_v, z_v, 0)$. Under an affine transformation, line segments transform
into line segments, and parallel lines transform into parallel lines. A fixed point
of a transformation is a point that remains invariant under that transformation. For
example, every point along the x-axis is a fixed point for the transformation $R_\theta(x)$.
Similarly, the origin is a fixed point for the scale transformation. The most general
rotation of an object with the origin as a fixed point is the rotation by an angle θ
about an arbitrary unit vector $v = (x_v, y_v, z_v, 0)$ passing through the origin. The matrix
for this transformation is given below.


$$R_\theta(v) = \begin{bmatrix} x_v^2 A + C & x_v y_v A - z_v B & x_v z_v A + y_v B & 0 \\ x_v y_v A + z_v B & y_v^2 A + C & y_v z_v A - x_v B & 0 \\ x_v z_v A - y_v B & y_v z_v A + x_v B & z_v^2 A + C & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{2.40}$$


where A = (1 − cos θ), B = sin θ, and C = cos θ. A rotation about an axis parallel to the
x-axis, with an arbitrary fixed point P, can be obtained by first applying a translation
$T_{-p}$ from P to the origin, a rotation $R_\theta(x)$ with origin as the fixed point, and finally a
translation $T_p$ back to the original position P. In matrix form, we write the composite
transformation as $T_p R_\theta T_p^{-1}$. Here $T^{-1}$ denotes the inverse of the transformation
T. For a translation, the inverse of $T_v$ is $T_{-v}$; and for a rotation, the inverse of $R_\theta(v)$
is $R_{-\theta}(v)$. A transformation of the form $TRT^{-1}$ is called the conjugate of R.
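The rotation matrix of Eq. 2.40 and the conjugate form can be sketched with plain nested lists. The function names below are hypothetical, the angle is assumed to be in radians, and the axis is assumed to be a unit vector.

```python
import math


def rotation_about_axis(theta, v):
    """Eq. 2.40 for a unit axis v = (x, y, z) through the origin."""
    x, y, z = v
    A, B, C = 1.0 - math.cos(theta), math.sin(theta), math.cos(theta)
    return [
        [x * x * A + C,     x * y * A - z * B, x * z * A + y * B, 0.0],
        [x * y * A + z * B, y * y * A + C,     y * z * A - x * B, 0.0],
        [x * z * A - y * B, y * z * A + x * B, z * z * A + C,     0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def translation(p):
    """4 x 4 translation matrix for the offset p."""
    return [[1.0, 0.0, 0.0, p[0]],
            [0.0, 1.0, 0.0, p[1]],
            [0.0, 0.0, 1.0, p[2]],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(M, N):
    """Product of two 4 x 4 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(M, h):
    """Transform a homogeneous 4-tuple h by the matrix M."""
    return tuple(sum(M[i][k] * h[k] for k in range(4)) for i in range(4))


def rotation_about_point(theta, v, p):
    """Conjugate T_p R T_p^{-1}: rotation about the axis v passing through p."""
    neg_p = (-p[0], -p[1], -p[2])
    return matmul(translation(p),
                  matmul(rotation_about_axis(theta, v), translation(neg_p)))


# Rotate by 90 degrees about the line through (0, 1, 0) parallel to the x-axis.
M = rotation_about_point(math.pi / 2, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
fixed = apply(M, (5.0, 1.0, 0.0, 1.0))   # a point on the axis stays in place
moved = apply(M, (0.0, 2.0, 0.0, 1.0))   # expected to move to (0, 1, 1)
```

Points on the rotation axis are fixed points of the composite transformation, which the first example checks.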

We have just seen a few examples of affine transformations that are commonly 
used for generating new points by transforming existing ones. We could also 
combine the coordinates of a set of points using a linear equation to obtain a new 
point. Such interpolation methods are discussed in the next section. 


2.7 Affine Combinations 


A linear combination of a set of points $P_i$ (i = 1, 2, ..., n) produces a new point Q as
shown below:


$$Q = \sum_{i=1}^{n} w_i P_i \tag{2.41}$$

where the coefficients (weights) $w_i$ are constants. If the weights satisfy the condition


$$\sum_{i=1}^{n} w_i = 1, \tag{2.42}$$


then Eq. 2.41 gives an affine combination of points. Additionally, if $w_i \ge 0$ for all i,
then the $w_i$ form a partition of unity, and Eq. 2.41 is said to give a convex combination
of points. As a special case, when n = 2, we get the formula for linear interpolation
between two points $P_1$ and $P_2$:


$$Q = (1 - t)P_1 + tP_2, \quad 0 \le t \le 1. \tag{2.43}$$

Fig. 2.10 (a) Linear interpolation and (b) trigonometric interpolation between two points


An interesting variation of the above equation can be derived by expressing the
parameter t as a function of an angle α, given by $t = \cos^2\alpha$. Then the coefficient
(1 − t) becomes $\sin^2\alpha$, and Eq. 2.43 takes the form $Q = \sin^2\alpha\, P_1 + \cos^2\alpha\, P_2$.
However, this trigonometric interpolation formula gives a non-uniform distribution
of points on the line when α is varied from 0° to 90° in equal steps. A comparison
of linear and trigonometric interpolations is given in Fig. 2.10. In Fig. 2.10a, the
parameter t is varied uniformly in the range [0, 1] in steps of 0.1, and in Fig. 2.10b,
the angle α is varied uniformly in the range [0°, 90°] in steps of 9°. Higher order
interpolation between points is discussed in Chap. 7 (Box 2.4).
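The difference between the two schemes is easy to observe numerically. In the sketch below (the function names `lerp` and `trig_interp` are assumptions), both interpolations are sampled at eleven parameter values between two points on the x-axis, and the spacing between successive samples is compared.

```python
import math


def lerp(p1, p2, t):
    """Linear interpolation of Eq. 2.43."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(p1, p2))


def trig_interp(p1, p2, alpha_deg):
    """Q = sin^2(alpha) P1 + cos^2(alpha) P2, i.e. Eq. 2.43 with t = cos^2(alpha)."""
    t = math.cos(math.radians(alpha_deg)) ** 2
    return lerp(p1, p2, t)


p1, p2 = (0.0, 0.0), (10.0, 0.0)
linear = [lerp(p1, p2, i / 10)[0] for i in range(11)]      # t = 0, 0.1, ..., 1
trig = [trig_interp(p1, p2, 9 * i)[0] for i in range(11)]  # alpha = 0, 9, ..., 90 degrees

linear_gaps = [b - a for a, b in zip(linear, linear[1:])]
trig_gaps = [abs(b - a) for a, b in zip(trig, trig[1:])]
```

The linear samples are equally spaced, while the trigonometric samples crowd together near both end points and spread out around the middle of the segment.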


Box 2.4 Bernstein Polynomials


Given a positive integer value n, we can construct n + 1 polynomials of degree
n of a parameter t as follows:


$$B_{i,n}(t) = \binom{n}{i} (1 - t)^{n-i}\, t^i, \quad i = 0, 1, \ldots, n$$

These polynomials form a partition of unity, i.e., $\sum_{i=0}^{n} B_{i,n}(t) = 1$.
Therefore, they can be used to generate convex combinations of points. Given
n + 1 points $P_i$, i = 0, ..., n, we define a point Q(t) as


$$Q(t) = \sum_{i=0}^{n} P_i\, B_{i,n}(t)$$


As the parameter t is varied from 0 to 1, we get a continuous parametric curve
called the Bezier curve. The equations for n = 1, 2, 3 are given below.

First degree (linear): $Q(t) = (1 - t) P_0 + t P_1$

Second degree (quadratic): $Q(t) = (1 - t)^2 P_0 + 2(1 - t)t\, P_1 + t^2 P_2$

Third degree (cubic): $Q(t) = (1 - t)^3 P_0 + 3(1 - t)^2 t\, P_1 + 3(1 - t) t^2 P_2 + t^3 P_3$

Fig. 2.11 A bilinear interpolation scheme first interpolates along the edges to get the values
at A and B, and then uses another linear interpolation along the line AB to get the value at Q


Given a triangle with vertices $P_1$, $P_2$ and $P_3$, we can perform a bilinear
interpolation between the values defined at the vertices to get the interpolated value
at an interior point Q (Fig. 2.11). Using this scheme, we can compute the colour
value at any point inside a triangle, given the colour values at the vertices. A
scan-line parallel to the base of the triangle sweeps the plane and generates the values of
A and B using the linear interpolation equation in Eq. 2.43 with the same parameter
t. Another linear interpolation between A and B with a parameter s gives the value
of Q. Thus we get


$$Q = (1 - s)\{(1 - t)P_1 + tP_3\} + s\{(1 - t)P_2 + tP_3\}, \quad 0 \le s, t \le 1. \tag{2.44}$$


The above equation could be simplified into a simple convex combination of
vertex points as


$$Q = (1 - k_1 - k_2)P_1 + k_1 P_2 + k_2 P_3, \quad 0 \le k_1, k_2, k_1 + k_2 \le 1, \tag{2.45}$$


where $k_1 = s(1 - t)$ and $k_2 = t$. The bilinear interpolation of vertex coordinates shown
above can be generalized to interpolate any quantity or attribute inside a triangle,
given its values at the vertices. Examples of such vertex attributes are colour, texture
coordinates and normal vectors. In the next section, we will consider another closely
related interpolation method for triangles.
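The equivalence of Eqs. 2.44 and 2.45 can be checked with a short sketch. The function names and the choice of pure red, green and blue as vertex colours are assumptions made for the illustration.

```python
def lerp(a, b, t):
    """Linear interpolation of Eq. 2.43, applied component-wise."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def bilinear(p1, p2, p3, s, t):
    """Eq. 2.44: interpolate along P1P3 and P2P3 with t, then between A and B with s."""
    a = lerp(p1, p3, t)
    b = lerp(p2, p3, t)
    return lerp(a, b, s)


def convex_form(p1, p2, p3, s, t):
    """Eq. 2.45 with k1 = s(1 - t) and k2 = t."""
    k1, k2 = s * (1.0 - t), t
    return tuple((1.0 - k1 - k2) * x1 + k1 * x2 + k2 * x3
                 for x1, x2, x3 in zip(p1, p2, p3))


# Vertex colours: pure red, green and blue.
red, green, blue = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
q_bilinear = bilinear(red, green, blue, 0.25, 0.5)
q_convex = convex_form(red, green, blue, 0.25, 0.5)
```

Both forms return the same colour; the convex form makes the three vertex weights explicit, which leads naturally to the barycentric coordinates of the next section.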


2.8 Barycentric Coordinates 


The barycentre of a rigid body is its centre of mass. For a triangle, the barycentre
is its centroid. Given vertices $P_1$, $P_2$, $P_3$ of a triangle, the centroid C can be easily
computed as the average of the vertex coordinates $(P_1 + P_2 + P_3)/3$. Thus C can
be represented as a convex combination of the vertex points. Indeed, Eq. 2.45 has
just shown that any point Q inside the triangle could be expressed as a convex
combination of vertices. If we re-write Eq. 2.45 as


$$Q = \lambda_1 P_1 + \lambda_2 P_2 + \lambda_3 P_3, \quad 0 \le \lambda_1, \lambda_2, \lambda_3 \le 1, \quad \lambda_1 + \lambda_2 + \lambda_3 = 1, \tag{2.46}$$


then the point Q is uniquely specified by a new set of coordinates $(\lambda_1, \lambda_2, \lambda_3)$ defined
by $P_1$, $P_2$, and $P_3$. This local coordinate system is called the barycentric coordinates
for the triangle. Barycentric coordinates are also sometimes referred to as areal
coordinates. From Eq. 2.46 we see that the vertices themselves have barycentric
coordinates given by


$$P_1 = (1, 0, 0), \quad P_2 = (0, 1, 0), \quad P_3 = (0, 0, 1) \tag{2.47}$$


As seen earlier, the centroid C has barycentric coordinates (1/3, 1/3, 1/3). The
barycentric coordinates of a point Q with respect to $P_1$, $P_2$, $P_3$ have a geometrical
interpretation as the ratios of the areas of triangles $QP_2P_3$, $QP_3P_1$, $QP_1P_2$ to the
area of the triangle $P_1P_2P_3$. In the following equations, the symbol Δ denotes the
signed area of a triangle:


$$\lambda_1 = \frac{\Delta Q P_2 P_3}{\Delta P_1 P_2 P_3}, \quad \lambda_2 = \frac{\Delta Q P_3 P_1}{\Delta P_1 P_2 P_3}, \quad \lambda_3 = \frac{\Delta Q P_1 P_2}{\Delta P_1 P_2 P_3} \tag{2.48}$$


The barycentric coordinates given in Eq. 2.48 are unique for every point on the
plane of the triangle. They can be directly used to get the interpolated value of
any quantity defined at the vertices of the triangle. If $f_{P_1}$, $f_{P_2}$, $f_{P_3}$ denote the values
of some attribute associated with the vertices, then the interpolated value at Q is
given by


$$f_Q = \lambda_1 f_{P_1} + \lambda_2 f_{P_2} + \lambda_3 f_{P_3}. \tag{2.49}$$

Fig. 2.12 A one-to-one mapping of points from one triangle to another can be obtained using
barycentric coordinates


Using barycentric coordinates we can establish a one-to-one mapping of points 
from within one triangle to another. For any given interior point Q of the first 
triangle, we compute the barycentric coordinates. The linear combination of the 
vertices of the second triangle with the barycentric coordinates of Q gives the 
coordinates of the corresponding point R inside the second triangle (Fig. 2.12). 
We can use this mapping to transfer values from the interior of the first triangle 
to the second. As an immediate application of this transfer, we can map an image 
(or texture) from one triangle to another. 

In a simplified two-dimensional case where $P_1 = (x_1, y_1)$, $P_2 = (x_2, y_2)$,
$P_3 = (x_3, y_3)$, $Q = (x_q, y_q)$, the expressions for the barycentric coordinates of Q
given in Eq. 2.48 assume the following form:


$$\lambda_1 = \frac{x_q(y_2 - y_3) + x_2(y_3 - y_q) + x_3(y_q - y_2)}{x_1(y_2 - y_3) + x_2(y_3 - y_1) + x_3(y_1 - y_2)}$$

$$\lambda_2 = \frac{x_q(y_3 - y_1) + x_3(y_1 - y_q) + x_1(y_q - y_3)}{x_1(y_2 - y_3) + x_2(y_3 - y_1) + x_3(y_1 - y_2)}$$

$$\lambda_3 = \frac{x_q(y_1 - y_2) + x_1(y_2 - y_q) + x_2(y_q - y_1)}{x_1(y_2 - y_3) + x_2(y_3 - y_1) + x_3(y_1 - y_2)} \tag{2.50}$$


If any of the above quantities is negative, then the point Q lies outside the triangle
$P_1P_2P_3$. Thus barycentric coordinates find applications in point inclusion tests.
In a general three-dimensional case, however, the area of a triangle computed using
Eq. 2.3 would always be positive, and correspondingly the area ratios in Eq. 2.48
would also be positive. As previously discussed in Sect. 2.2, the computation of
signed areas of triangles requires a view vector w. Since we need this vector to
be fixed with respect to every triangle in Eq. 2.48, we can conveniently choose
$w = (P_2 - P_1) \times (P_3 - P_1)$. Now the barycentric coordinates $\lambda_1$, $\lambda_2$ and $\lambda_3$ in Eq. 2.48
can be computed by applying the formula in Eq. 2.8 to each of the triangles $QP_2P_3$,
$QP_3P_1$, $QP_1P_2$ and $P_1P_2P_3$. If the conditions $\lambda_1 + \lambda_2 + \lambda_3 = 1$, $0 \le \lambda_1, \lambda_2, \lambda_3 \le 1$
are met, then Q lies on the plane defined by the points $P_1$, $P_2$, $P_3$, and also lies within
the triangle $P_1P_2P_3$. Note that in the most general case, the point Q need not be on
the plane of the triangle. Hence we require the additional condition that the sum of
barycentric coordinates equals 1 to ensure that the points are coplanar.
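The three-dimensional point inclusion test can be sketched as shown below. Since Eq. 2.8 is not reproduced in this section, the sketch assumes a particular convention for it: the signed area is taken as the ordinary triangle area, negated when the triangle's normal points away from the view vector w. The function names and the tolerance `eps` are also assumptions.

```python
import math


def sub(a, b):
    """Difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def signed_area(a, b, c, w):
    """Area of triangle abc, negated when its normal opposes w (assumed convention)."""
    n = cross(sub(b, a), sub(c, a))
    area = 0.5 * math.sqrt(dot(n, n))
    return area if dot(n, w) >= 0.0 else -area


def barycentric(q, p1, p2, p3):
    """Area ratios of Eq. 2.48 with w = (P2 - P1) x (P3 - P1)."""
    w = cross(sub(p2, p1), sub(p3, p1))
    total = signed_area(p1, p2, p3, w)
    return (signed_area(q, p2, p3, w) / total,
            signed_area(q, p3, p1, w) / total,
            signed_area(q, p1, p2, w) / total)


def inside_triangle(q, p1, p2, p3, eps=1e-9):
    """True when q is coplanar with the triangle and lies within it."""
    lam = barycentric(q, p1, p2, p3)
    return abs(sum(lam) - 1.0) < eps and all(-eps <= l <= 1.0 + eps for l in lam)


p1, p2, p3 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
lam_inside = barycentric((0.25, 0.25, 0.0), p1, p2, p3)
```

With this convention, a point in the plane but outside the triangle yields a negative coordinate, while a point off the plane yields coordinates whose sum differs from 1, so both conditions in the text are needed.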

Barycentric coordinates are also useful for finding the centre of a circle that
passes through three non-collinear points, P, Q, R in three dimensions. Denoting
the vectors along the sides of the triangle by a = Q − P, b = R − Q, and c = P − R, the
barycentric coordinates of the centre of the circle are


$$\lambda_1 = \frac{-|b|^2 (c \cdot a)}{2|a \times b|^2}, \quad \lambda_2 = \frac{-|c|^2 (a \cdot b)}{2|a \times b|^2}, \quad \lambda_3 = \frac{-|a|^2 (b \cdot c)}{2|a \times b|^2} \tag{2.51}$$


The centre of the circle is then given by the following linear combination of the
three points:


$$C = \lambda_1 P + \lambda_2 Q + \lambda_3 R. \tag{2.52}$$
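A direct transcription of Eqs. 2.51 and 2.52 is given below as a sketch; the function name `circumcentre` is hypothetical, and the three input points are assumed to be non-collinear so that the denominator is non-zero.

```python
def sub(a, b):
    """Difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def circumcentre(P, Q, R):
    """Centre of the circle through three non-collinear points (Eqs. 2.51, 2.52)."""
    a, b, c = sub(Q, P), sub(R, Q), sub(P, R)
    n = cross(a, b)
    denom = 2.0 * dot(n, n)
    l1 = -dot(b, b) * dot(c, a) / denom
    l2 = -dot(c, c) * dot(a, b) / denom
    l3 = -dot(a, a) * dot(b, c) / denom
    return tuple(l1 * p + l2 * q + l3 * r for p, q, r in zip(P, Q, R))


# For a right-angled triangle the centre is the mid-point of the hypotenuse.
centre = circumcentre((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0))
```

A useful check on any result is that the returned centre is equidistant from all three input points.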


In the following section, we will look at the application of vectors in the
Phong-Blinn illumination model used for lighting calculations in the OpenGL pipeline.


2.9 Basic Lighting 


The hardware accelerated lighting model that is traditionally used in Computer
Graphics applications is based on the Phong-Blinn approximation for an
omni-directional point-light source. A local illumination model that does not account for
complex effects such as reflections, refractions, shadows and indirect illumination is
found to be generally adequate for a majority of graphics applications. In this model,
light-material interaction is simply modelled using a component-wise multiplication
of material colour and light colour. We can represent colour by a vector comprising
red, green and blue components as c = (r, g, b, 0). This vector model can be
further generalized by replacing the fourth component by k that represents the
transparency (or opacity) term which can take non-zero values. In the discussion
that follows, $m_a$, $m_d$, $m_s$ denote respectively the ambient, diffuse and specular
components of material colour, and $l_a$, $l_d$, $l_s$ the corresponding components of
the light source. Each of these colour components is typically a 3-tuple consisting
of red, green and blue values. For notational convenience, we represent $m_a$ by
the vector $(r_{ma}, g_{ma}, b_{ma})$, $l_a$ by the vector $(r_{la}, g_{la}, b_{la})$, and so on. The ambient
light-material interaction is then modelled by the component-wise vector product


$$m_a \bullet l_a = (r_{ma} r_{la},\; g_{ma} g_{la},\; b_{ma} b_{la}) \tag{2.53}$$


Fig. 2.13 Important vectors and angles between them, used in lighting calculations


Figure 2.13 shows the geometry of unit vectors used for computing diffuse and
specular reflections from a surface. From a point P on a surface, s denotes the
unit vector towards the light source, n the unit surface normal vector, and v the
unit vector towards the viewer. The perceived intensity of reflection at the viewer's
position varies with changes in the angles between these vectors. The variations in
diffuse and specular reflections are represented by multiplicative factors $k_d$ and $k_s$
respectively. According to the Lambertian reflectance model, the intensity of diffuse
reflection from a surface is uniform in all directions, and varies as the cosine of the
angle θ between the light source vector s and the surface normal vector n, and is
therefore proportional to $s \cdot n$. If the angle between the two vectors is greater than
90°, the normal vector faces away from the light source vector, and the surface is in
shadow. In such a situation, the value of $k_d$ must be set to 0. We therefore have the
following view-independent factor for the diffuse term:


$$k_d = \max(s \cdot n,\, 0) \tag{2.54}$$


The specular reflection factor $k_s$ is computed as a function of the cosine of the
angle φ between the direction of unit specular reflection r given by Eq. 2.5 and
the unit view vector v, with an exponent f known as the shininess term or
Phong's constant. The exponent is useful in controlling the overall brightness and
the concentration of the specular highlight.


$$k_s = \max(\cos^f \phi,\, 0) = \max((r \cdot v)^f,\, 0) \tag{2.55}$$


Blinn's approximation eliminates the need for computing the specular
reflection vector using Eq. 2.5 by defining a unit vector h along the direction s + v.
This vector is called the half-way vector. If $n \cdot h = \cos\beta$, then equating the angles on
either side of h gives


$$\theta + \beta = \theta - \beta + \phi \tag{2.56}$$

Fig. 2.14 Schematic of the calculations performed in a basic lighting model


From the above equation we find that φ = 2β. The term $r \cdot v$ in Eq. 2.55 can
therefore be replaced with $n \cdot h$ by absorbing the factor 2 in $k_s$. This gives Blinn's
approximation for $k_s$:


$$k_s = \max((n \cdot h)^f,\, 0). \tag{2.57}$$


A schematic of the lighting computation using the Phong-Blinn illumination 
model outlined above is given in Fig. 2.14. 
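The reflection factors of Eqs. 2.54 and 2.57 can be sketched as follows. Several details here are assumptions and are not taken from the text: the function names, the dictionary layout of the material and light colours, the clamping of $n \cdot h$ before the exponent is applied, and the final summation of the ambient, diffuse and specular terms, which follows the usual convention for this model.

```python
import math


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(dot(u, u))
    return tuple(c / length for c in u)


def reflection_factors(s, n, v, f):
    """Return (k_d, k_s) for light vector s, normal n, view vector v, shininess f."""
    s, n, v = normalize(s), normalize(n), normalize(v)
    h = normalize(tuple(a + b for a, b in zip(s, v)))   # half-way vector
    k_d = max(dot(s, n), 0.0)                            # Eq. 2.54
    k_s = max(dot(n, h), 0.0) ** f                       # Eq. 2.57, clamped before the power
    return k_d, k_s


def shade(material, light, k_d, k_s):
    """Sum of ambient, diffuse and specular terms (assumed combination),
    each a component-wise product as in Eq. 2.53."""
    return tuple(
        material["ambient"][i] * light["ambient"][i]
        + k_d * material["diffuse"][i] * light["diffuse"][i]
        + k_s * material["specular"][i] * light["specular"][i]
        for i in range(3))


white = (1.0, 1.0, 1.0)
light = {"ambient": white, "diffuse": white, "specular": white}
material = {"ambient": (0.2, 0.2, 0.2),
            "diffuse": (0.5, 0.0, 0.0),
            "specular": (0.3, 0.3, 0.3)}

# Light at 45 degrees to the normal, viewer along the normal.
k_d, k_s = reflection_factors((0.0, 1.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0), 10)
colour = shade(material, light, k_d, k_s)
```

Note that the half-way vector is undefined when s and v point in exactly opposite directions; a production implementation would need to handle that case.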


2.10 Summary 


This chapter reviewed some of the geometrical computations involving points, lines, 
planes, triangles and curves, that are fundamental to many algorithms in computer 
graphics. Important concepts such as homogeneous coordinate representation of 
points, signed angles, signed areas of triangles, and barycentric coordinates were 
outlined. Equations relating to affine transformations and affine combinations of 
points were discussed. This chapter also gave the equations for a basic lighting 
model consisting of ambient, diffuse and specular components of reflection. 

The concepts presented in this chapter will form the foundation for several 
methods that will be discussed in subsequent chapters. The next chapter introduces 
a hierarchical structure that is useful for modelling transformations applied to 
articulated models and other similar objects containing interconnected parts. 


2.11 Supplementary Material for Chap. 2


The section Chapter2/Code on this book's companion website contains code 
examples demonstrating the application of concepts discussed in this chapter. 
A brief description of these programs is given below.

1. Point3.cpp
Additional files:


The Point3 class supports most commonly used operations on points 
represented using 4-dimensional homogeneous coordinates. The class has the 
subclass Vec3 that supports vector operations such as dot and cross products, 
vector magnitude calculation and normalization. The documentation of these 
classes can be found in Appendix A. 


2. Triangle.cpp 


Additional files:


The Triangle class provides methods for computing area, surface normal 
vector, and the barycentric coordinates of a point with respect to a triangle. It also 
has functions for performing the point inclusion test and bilinear interpolation. 
The documentation of this class can be found in Appendix A. 


3. Matrix.cpp 


The Matrix class contains methods for matrix operations (using 4x4
matrices) such as addition, multiplication, computation of transpose and inverse
matrices, and transformation of points. The documentation of this class can be
found in Appendix A.


4. Interpolate.cpp 


Additional files: 
None 


The program creates a shape-tween between two user-defined polygonal 
shapes using simple linear interpolation between corresponding vertices. Use left 
mouse clicks on the upper left side of the screen to define the first polygonal 
shape. Similarly, use right mouse clicks on the upper right side of the screen to 
draw the second polygon. Pressing the space bar creates the shape-tween between 
the first and the second polygons in the bottom half of the window. 


5. Bilinear.cpp 


Additional files: 
None 




The program uses Eq. 2.45 to obtain a bilinear interpolation of colour values
at the vertices to fill the interior of a triangle. For comparison, a second similar 
triangle is rendered using the OpenGL pipeline that uses the Gouraud shading 
algorithm. The vertex colours are randomly generated every time the space bar 
is pressed. 


6. Bezier2D.cpp 


Additional files: 
None 


The program uses Bernstein polynomials (Box 2.4) to generate a
two-dimensional Bezier curve for a set of user-defined control points. Use left mouse
clicks on the screen to define a set of control points. The control polygonal line
is shown in red colour. The Bezier curve for the input points is simultaneously
drawn in blue colour.


7. Barycentric.cpp 


Additional files: 


The program uses barycentric mapping (Fig. 2.12) to map points from one 
triangle to another. Two triangles are displayed when the program is initiated. 
Use left mouse clicks inside the left triangle to specify a few points. The points 
are connected using a polygonal line drawn in magenta colour. The map of these 
points and the polygonal line connecting them inside the triangle on the right 
hand side are simultaneously drawn in blue colour. 


2.12 Bibliographical Notes 


Several books on introductory computer graphics provide an outline of concepts 
discussed in this chapter. Some recent publications that can serve as excellent 
references are Angel (2008), Hill and Kelley (2007), and McConnell (2006). 
A number of books give emphasis to the mathematical tools used in computer 
graphics. Notable in this area are Vince and Vince (2006), Lengyel (2004), Buss 
(2003), Schneider and Eberly (2003), and Dunn and Parberry (2002). 

Comninos (2006) gives a comprehensive coverage of topics on vector and matrix 
algebra, transformations, lighting and shading models. A concise description of 
homogeneous coordinates and their applications in computer graphics can be found 
in Vince (2001). Topics in linear algebra and topology that are used in many 
algorithms in computer graphics are discussed at length in Agoston (2005) and Farin 
and Hansford (2005).


Chapter 3
Scene Graphs


Overview 


A scene graph is a data structure commonly used to represent hierarchical
relationships between transformations applied to a set of objects in a three-dimensional
scene. It finds applications in a variety of acceleration and rendering algorithms. 
A scene graph could also be used to organize visual attributes, bounding volumes, 
and animations as a hierarchy in a collection of objects. In the most general form, 
any scene related information that can be organized in a hierarchical fashion can be 
stored in a scene graph. It also provides a convenient way of representing logical 
groups of objects formed using their spatial positions or attributes. In this chapter, 
we will outline the fundamental properties of scene graphs, look at some of the 
implementation aspects and consider a few applications. 


3.1 The Basic Structure of a Scene Graph 


The structure and contents of a scene graph will obviously depend on the type of 
information it stores, or equivalently, the set of operations it is used for. Let us 
consider a simple tree structure that contains three types of nodes: 


1. The root node of the tree represents the whole collection of objects in a
three-dimensional scene. We call this node World or Virtual Universe. The root node
is a special type of a group node. 

2. A group node is an internal node of the tree. It can contain any number of 
children, and represents a logical grouping of objects. A group node does not 
store geometrical data, but it can contain some semantic information such as 
transformations or visibility attributes applied to a group. 

3. Every leaf node represents either an object or a part of an object, and maintains 
the necessary geometrical information in addition to some semantic information. 
Camera and light sources may also be represented by leaf nodes.

Fig. 3.1 An example of a scene graph, where every internal node is a group node and every leaf
node is an object node


Fig. 3.2 (a) An example of a model consisting of four connected parts that can move relative to 
each other. (b) A scene graph of the object model 


Figure 3.1 shows an example of a tree with all three types of nodes described 
above. The tree structure of a scene graph allows a property associated with a group 
node to be inherited by all of its child nodes. For example, a transformation applied 
to a group node can be considered as also applied to all its children. Similarly, a 
bounding volume, if attached to a group node, also represents the overall bounding 
volume for the whole collection of its child nodes. 

A scene graph is particularly useful for animating a composite object that has 
several parts which should move as if the parts are all physically connected to each 
other. A typical example of such an object is an articulated character model. We 
illustrate the formation of a scene graph using a simple model consisting of four 
interconnected parts: Base, Part-1, Part-2, and Part-3, as shown in Fig. 3.2.

Fig. 3.3 A 5-link joint chain and its scene graph


As can be seen from the diagram of the scene graph, the whole model is first 
subdivided into three logical groups Part-1, Base and a subgroup Group-2 to which 
Part-2 and Part-3 belong. Shortly we will see how we can assign transformation 
parameters to the individual nodes of the scene graph in such a way that the parts 
can rotate relative to each other while at the same time remaining connected as a 
single animatable object. We now consider a closely related object model, a joint 
chain consisting of five links as shown in Fig. 3.3. 

Joint chains similar to the one shown above are commonly found in robotics and 
articulated models in computer graphics. The scene graph represents a hierarchical 
subdivision of the model, where at the first level, the whole object belongs to a 
single group World. At the next level of subdivision we have Link-1 and a subgroup
Group-1 that contains the remaining links. Any rotational transformation applied 
to Group-1 affects all members of that group. It may appear that the group node 
Group-4 is redundant as it has only one child. However, the node is useful to provide 
a clear separation between the initial transformations applied to the object in Link-5 
in its own coordinate system and the transformations applied relative to Link-4's 
frame. We will also later add a camera as an object belonging to Group-4. The 
transformation hierarchy represented by scene graphs is explored in more detail in 
the next section. 


3.2 Transformation Hierarchy 


A transformation applied to one part of an object often cascades with the
transformations applied to the adjacent interconnected parts. For example, a change in the
orientation of Part-2 of the model in Fig. 3.2a also affects Part-3. Such dependencies 
can be easily converted into hierarchical representations that are suitable for scene 
graphs. We consider below three examples involving hierarchical transformations: 
(i) the model of a mechanical part shown in Fig. 3.2, (ii) an articulated character 
model, and (iii) a small planetary system.

Fig. 3.4 A general transformation of the model in Fig. 3.2, showing translational and rotational
parameters associated with links. The x and y axes denote the reference frame for the world
coordinate system


3.2.1 A Mechanical Part 


A general two-dimensional transformation of the model in Fig. 3.2a along with the 
translational and rotational parameters of each link is shown in Fig. 3.4. We will 
use T(a) to denote a translation by a vector a, and R(θ) to denote an anticlockwise
rotation through an angle θ. Note that the joint angles $\theta_1$, $\theta_2$, $\theta_3$ define relative angles
of rotations of one part with respect to another. In order to build the transformation 
hierarchy, we have to consider first the transformation of each link from its own 
local coordinate frame to the coordinate frame of its group. The sequence in which 
the transformations are applied is shown in Fig. 3.5. 

As shown in Fig. 3.5, transformations are applied from the leaf nodes upward 
to the root of the scene graph. Part-3 is first rotated by an angle θ3, and then 
translated along the length of Part-2 by a vector d3. This composite transformation 
has a matrix given by T(d3)R(θ3). Group-2 now contains Part-2 and the transformed 
version of Part-3. In other words, both Part-2 and Part-3 have been transformed 
into the coordinate space of Group-2. It should be noted here that any rotational 
transformation of Part-2 is always applied to Group-2. The transformation matrix 
T(d2)R(θ2) effectively converts the points from the coordinate system of Group-2 
to that of its parent group, Group-1. Figure 3.6 shows the scene graph with the 
transformation matrices added to the tree nodes. 

From the above discussion, we note that every node transformation is defined 
relative to the node's parent. At a leaf node, a transformation converts vertices from 
the local coordinate space of an object to its parent's coordinate space. If an object 
node has an identity transformation I, it only shows that its parent's node has the 
same coordinate reference frame as the object node. This also means that any trans- 
formation applied to that node is actually applied to its parent group node. In the 
above example, transformations applied to the Base are actually applied to Group-1, 
and they indirectly affect the transformations of each of Group-1's child nodes. 
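
The way node matrices of the form T(d)R(θ) combine along a branch can be illustrated with a short C++ sketch. This is not the book's implementation: the names Mat3, Vec2, nodeTransform and part3ToGroup1 are hypothetical, and the offsets and angles used with it are made-up values. It carries a point from Part-3's local frame into Group-1's frame using the product T(d2)R(θ2) T(d3)R(θ3).

```cpp
#include <cmath>

// 3x3 homogeneous matrix for two-dimensional transformations (row-major).
struct Mat3 { double m[3][3]; };
struct Vec2 { double x, y; };

// T(d): translation by the vector d.
Mat3 translation(Vec2 d) {
    return {{{1.0, 0.0, d.x}, {0.0, 1.0, d.y}, {0.0, 0.0, 1.0}}};
}

// R(theta): anticlockwise rotation through theta radians about the origin.
Mat3 rotation(double theta) {
    const double c = std::cos(theta), s = std::sin(theta);
    return {{{c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0}}};
}

// Matrix product a * b (b acts on a point first, then a).
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 r = {};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Applies a homogeneous transformation to the point p.
Vec2 apply(const Mat3& t, Vec2 p) {
    return {t.m[0][0] * p.x + t.m[0][1] * p.y + t.m[0][2],
            t.m[1][0] * p.x + t.m[1][1] * p.y + t.m[1][2]};
}

// Node transformation T(d)R(theta): rotate first, then translate.
Mat3 nodeTransform(Vec2 d, double theta) {
    return multiply(translation(d), rotation(theta));
}

// Converts a point p given in Part-3's local frame to Group-1's frame.
Vec2 part3ToGroup1(Vec2 p, Vec2 d2, double theta2, Vec2 d3, double theta3) {
    const Mat3 group2ToGroup1 = nodeTransform(d2, theta2);
    const Mat3 part3ToGroup2 = nodeTransform(d3, theta3);
    return apply(multiply(group2ToGroup1, part3ToGroup2), p);
}
```

With d2 = (2, 0), θ2 = 90°, d3 = (3, 0) and θ3 = 0, the origin of Part-3 is first moved to (3, 0) in Group-2's frame and then to (2, 3) in Group-1's frame, which shows how a rotation of Part-2 carries Part-3 along with it.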

Fig. 3.5 Each moveable component of an object model is transformed from its local coordinate 
space to its group’s space, and subsequently to the coordinate space of the group’s parent 

Fig. 3.6 Scene graph with transformation matrices attached to nodes 


3.2.2 A Simple Character Model 


We now consider an articulated character model and its scene graph shown in 
Fig. 3.7. As in the previous example, we can define the translational and rotational 
transformations for each node, based on the joint position and angle of each link 
relative to its parent. Vectors v1 ... v9 denote the offsets of the origin of the links 
relative to their parent's local coordinate system in the initial configuration. The 
vector v0 denotes the position of the base link (Torso) in the world coordinate frame. 

Fig. 3.7 Scene graph of a basic articulated character model 


The angles ψx, ψy, ψz represent a generalized rotation of the whole model in terms 
of Euler angles defined with respect to the principal axes of the world coordinate 
system. A detailed description of Euler angle rotations can be found in Sect. 5.4.1. 

The model can be animated using key-frame sequences for the joint angles 
θ1 ... θ9, and its position and orientation can be controlled using key-frame sequences 
for v0, ψx, ψy, and ψz. The transformation hierarchy, if properly defined, ensures 
that the links stay connected and are rotated only about the joints. Owing to the 
symmetry of the model, we can also make use of the following relationships among 
the components (xi, yi, zi) of translational parameters vi: 

\[
\begin{aligned}
x_2 &= -x_4, & y_2 &= y_4 \\
x_3 &= -x_5, & y_3 &= y_5 \\
x_6 &= -x_8, & y_6 &= y_8 \\
x_7 &= -x_9, & y_7 &= y_9
\end{aligned}
\tag{3.1}
\]


3.2.3 A Planetary System 


As the third example, we consider a simple planetary system consisting of the 
Sun, the Earth and the Moon. The translational and rotational parameters used in 
modelling the system are shown in Fig. 3.8. 

The rotation angles marked in Fig. 3.8 represent the spin of the Earth and the Moon 
about vertical axes, the revolution of the Earth-Moon system around the Sun, and 
the revolution of the Moon around the Earth. The scene graph for this system is 
shown in Fig. 3.9. 

Fig. 3.8 A simple planetary system showing the translational and rotational parameters used for 
the construction of its scene graph 

Fig. 3.9 Scene graph of the planetary system 

One notable difference between the planetary system example and the previous 
ones is the form of transformation matrices applied to nodes. Most of the transformations 
applied in a hierarchical fashion have a general form T(v)R(θ), which is a 
rotation followed by a translation. In simple implementations, the structure of nodes 
is often designed to accept only transformations of the form T(v)R(θ) or I. Scene 
graphs where transformations at internal nodes have one of the forms I, T(v), R(θ), 
or T(v)R(θ) are said to be in the standard form. The example given in Fig. 3.9 is 
an exception to this rule. However, this scene graph can be easily converted to the 
standard form with the addition of a group node as shown in Fig. 3.10. 

The equivalence of the scene graphs in Figs. 3.9 and 3.10 can be verified by 
obtaining the combined final transformation matrices applied to the leaf nodes. In 
a scene graph, transformations are combined using a recursive procedure starting at 
the root node, accumulating transformations at internal nodes and ending at object 
nodes. This process will be explained in detail in the next section. 

Fig. 3.10 The scene graph in Fig. 3.9 converted to the standard form 


3.3 Relative Transformations 


The transformation of one node relative to another can be readily obtained from 
a scene graph. The model transformation matrix of an object gives the composite 
transformation that converts points from the local coordinate space of the object 
to the world coordinate space. In a scene graph, this is the transformation of the 
object node relative to the root (the world node). The composite matrix can be 
obtained by collecting all matrices along the path from the root node to the leaf 
node representing the object. At each node, the matrix is post-multiplied by the 
transformation matrix of that node. The process is illustrated in Fig. 3.11, where 
node transformation matrices are denoted by letters A..G. The model transformation 
matrix of the object node in the figure is ABCDE. 

Leaf nodes can also be used to represent fictitious objects such as light sources 
and cameras. In Fig. 3.11, the transformation from the coordinate system of the 
camera to world coordinates is given by AFG. The inverse of this matrix, (AFG)⁻¹, 
transforms a point from world space to camera space. This matrix is called the 
view matrix. The combined model-view matrix that transforms the object's local 
coordinates to camera space is therefore given by (AFG)⁻¹ABCDE, or equivalently, 
G⁻¹F⁻¹BCDE. An upward tree traversal from a leaf node to root can be quickly 
performed if every node has a pointer to its parent. On the other hand, a downward 
traversal would typically require a recursive algorithm similar to the depth-first 
search method. 
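
The upward traversal and the model-view product can be sketched as follows. To keep the example short it uses two-dimensional rigid transformations of the form T(t)R(angle) instead of 4x4 matrices; the names Xform, Node, toWorld and modelView are hypothetical and are not part of the classes described later in this chapter.

```cpp
#include <cmath>

// Rigid 2D transformation T(t)R(angle): p' = R(angle) p + t.
struct Xform { double angle, tx, ty; };

// Product a * b: b is applied to a point first, then a.
Xform compose(const Xform& a, const Xform& b) {
    const double c = std::cos(a.angle), s = std::sin(a.angle);
    return {a.angle + b.angle,
            c * b.tx - s * b.ty + a.tx,
            s * b.tx + c * b.ty + a.ty};
}

// Inverse of T(t)R(angle), which is R(-angle)T(-t).
Xform inverse(const Xform& a) {
    const double c = std::cos(a.angle), s = std::sin(a.angle);
    return {-a.angle,
            -(c * a.tx + s * a.ty),
            -(-s * a.tx + c * a.ty)};
}

struct Node {
    Xform local;         // transformation relative to the parent node
    const Node* parent;  // nullptr for the root (world) node
};

// Upward traversal using parent pointers: the accumulated matrix is
// pre-multiplied by each ancestor's matrix until the root is reached.
Xform toWorld(const Node& node) {
    Xform m = node.local;
    for (const Node* p = node.parent; p != nullptr; p = p->parent)
        m = compose(p->local, m);
    return m;
}

// Model-view transformation: (camera-to-world)^-1 * (object-to-world).
Xform modelView(const Node& object, const Node& camera) {
    return compose(inverse(toWorld(camera)), toWorld(object));
}
```

This version multiplies the full paths up to the root, so the matrices of shared ancestors (A in Fig. 3.11) are computed and then cancel. An implementation that stops at the common ancestor avoids that redundant work.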

The above example can be generalized to a procedure for finding the transfor- 
mation from one object's local coordinate frame to another's. If we require the 
transformation from Object-1 (source) to Object-2 (target) in a scene graph, we 
have to first find the Lowest Common Ancestor (LCA) of both the object nodes. Let 
the transformation matrix of this common ancestor be denoted by M (Fig. 3.12). 
Let S1, ..., Sm denote the transformations of nodes starting from the child of LCA 
towards Object-1, and T1, ..., Tn the transformations towards Object-2 as shown in 
Fig. 3.12. The composite transformation from the source's frame to the target's 
frame is given by the matrix Tn⁻¹ ... T1⁻¹ S1 ... Sm. Note that this matrix product does 
not involve the transformation M of the LCA or any of its ancestors. 

Fig. 3.11 Computation of the model transformation matrix of an object represented by a leaf node 
in a scene graph 

Fig. 3.12 Representing Object-1’s coordinates relative to Object-2’s local reference frame 
requires the computation of the Lowest Common Ancestor (LCA) of both the nodes 

There are several well-known algorithms to compute the Lowest Common 
Ancestor of two nodes in a tree. A simple method uses two lists of nodes visited 
in sequential upward traversals of the tree from the two nodes towards the root. 
The last item of both lists would be the world node. Corresponding entries in the 
lists are compared for equality, starting from the last item towards the beginning 
of each list. The process of comparison stops when the list entries are different, 
or when one of the lists is exhausted. The previous matched entry in the lists gives 
the reference to the Lowest Common Ancestor (Fig. 3.13). 

Fig. 3.13 An algorithm for finding the Lowest Common Ancestor 
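
A minimal C++ sketch of this two-list method is given below. It is only an illustration: the Node structure and the function names pathToRoot and lowestCommonAncestor are hypothetical, and each node is assumed to store a pointer to its parent.

```cpp
#include <vector>

struct Node {
    const Node* parent;  // nullptr for the root (world) node
};

// Lists the nodes visited in an upward traversal from n to the root.
// The root is the last item of the returned list.
std::vector<const Node*> pathToRoot(const Node* n) {
    std::vector<const Node*> path;
    for (; n != nullptr; n = n->parent)
        path.push_back(n);
    return path;
}

// Compares the two lists from their last items towards the beginning and
// returns the last entry that matched. Returns nullptr if the nodes do not
// belong to the same tree.
const Node* lowestCommonAncestor(const Node* a, const Node* b) {
    const std::vector<const Node*> pa = pathToRoot(a);
    const std::vector<const Node*> pb = pathToRoot(b);
    const Node* lca = nullptr;
    auto ia = pa.rbegin();
    auto ib = pb.rbegin();
    while (ia != pa.rend() && ib != pb.rend() && *ia == *ib) {
        lca = *ia;
        ++ia;
        ++ib;
    }
    return lca;
}
```

If one of the two nodes is an ancestor of the other, its list is exhausted first and that node itself is returned. The cost is proportional to the depth of the tree, which is small for typical scene graphs.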


3.4 Bounding Volume Hierarchy 


Bounding volumes of objects are used for fast collision detection and also in 
acceleration algorithms such as view frustum culling. Bounding volumes can be 
computed for different moving parts of an object and then combined in a hierarchical 
manner to obtain the overall bounding volume (Fig. 3.14). The geometric parameters 
defining a bounding volume can be stored in a scene graph node, and computed on 
the fly whenever a transformation is applied to the vertices. 

Commonly used bounding volumes are axis-aligned bounding boxes (AABB), 
oriented bounding boxes (OBB), spheres, discrete oriented polytopes, and convex 
hulls. Each bounding volume has certain advantages and limitations over others, 
and is suitable for a specific set of applications. An AABB can be computed and 
represented using six parameters that define the minimum and maximum values of 
x, y, and z coordinates of points it encloses. However, these parameters will have 
to be recomputed every time an object is rotated. On the other hand, OBBs and 
spheres are rotation invariant. In this chapter, examples are provided using AABBs 
and spheres only. Other types of bounding volumes and their computational aspects 
are discussed in detail in Chap. 9. 

Since the bounding volume parameters depend on the transformed object 
coordinates, bounding volume updates can be performed only after applying the 
transformations. Unlike transformations, this process starts at the nodes containing 
object primitives, and the bounding volume parameters of group nodes are updated 

Fig. 3.14 Two-dimensional bounding volume hierarchies for the model in Fig. 3.2, using axis-aligned 
rectangles (top row) and circles (bottom row) 


Fig. 3.15 (a) Bounding circles of two objects. (b) Combined bounding circle formed using the 
parameters of the two component bounding circles. (c) The minimal bounding circle 


based on the computed values at the child nodes. It is therefore often desirable that 
the parameters defining a bounding volume stored at a group node can be computed 
based on the bounding volume parameters of its child nodes. It should also be noted 
here that such a computation may not always yield a minimal bounding volume. 
For example, the bounding sphere computed as the union of two bounding spheres 
may not necessarily be the minimal bounding sphere for the union of points within 
those spheres. A two-dimensional equivalent of this case is shown in Fig. 3.15, using 
bounding circles of two objects. 

We discuss below the process of updating the bounding volume parameters 
(using AABBs and spheres as examples) at a group node based on the updated 
parameters of its child nodes. If there are n child nodes, we combine the volumes of 
two children at a time and obtain the final bounding volume of the parent, in n−1 
steps. Given two AABBs with parameters {xmin1, ymin1, zmin1, xmax1, ymax1, zmax1} 
and {xmin2, ymin2, zmin2, xmax2, ymax2, zmax2}, the combined volume has parameters 
{min(xmin1, xmin2), min(ymin1, ymin2), min(zmin1, zmin2), max(xmax1, xmax2), 
max(ymax1, ymax2), max(zmax1, zmax2)} (Box 3.1). 

Box 3.1 Bounding Volumes 

Given a set of mesh vertices with coordinates {xi, yi, zi}, i = 0 ... N−1, 
the bounding volume parameters for AABB and sphere are computed as 
follows: 

Axis Aligned Bounding Box (AABB): {xmin, ymin, zmin, xmax, ymax, zmax} 

\[
\begin{aligned}
x_{\min} &= \min_i (x_i), & x_{\max} &= \max_i (x_i) \\
y_{\min} &= \min_i (y_i), & y_{\max} &= \max_i (y_i) \\
z_{\min} &= \min_i (z_i), & z_{\max} &= \max_i (z_i)
\end{aligned}
\]

Sphere: {u, v, w, r} 

Computation of bounding sphere using the geometric centre of points: 

\[
u = \frac{1}{N}\sum_{i=0}^{N-1} x_i, \qquad
v = \frac{1}{N}\sum_{i=0}^{N-1} y_i, \qquad
w = \frac{1}{N}\sum_{i=0}^{N-1} z_i
\]
\[
d_i = (x_i - u)^2 + (y_i - v)^2 + (z_i - w)^2, \quad i = 0 \ldots N-1, \qquad
r = \sqrt{\max_i (d_i)}
\]

Computation of bounding sphere using AABB of points: 

\[
u = \tfrac{1}{2}(x_{\min} + x_{\max}), \qquad
v = \tfrac{1}{2}(y_{\min} + y_{\max}), \qquad
w = \tfrac{1}{2}(z_{\min} + z_{\max})
\]
\[
r = \tfrac{1}{2}\sqrt{(x_{\max} - x_{\min})^2 + (y_{\max} - y_{\min})^2 + (z_{\max} - z_{\min})^2}
\]
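
The AABB computations described above, together with the centre-based bounding sphere of Box 3.1, could be coded along the following lines. This is a sketch under stated assumptions: the type and function names (Point, AABB, Sphere, computeAABB, computeSphere, merge, mergeAll) are hypothetical, and every function assumes a non-empty input.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point  { double x, y, z; };
struct AABB   { double xmin, ymin, zmin, xmax, ymax, zmax; };
struct Sphere { double u, v, w, r; };

// AABB of a non-empty set of vertices: per-axis minimum and maximum.
AABB computeAABB(const std::vector<Point>& pts) {
    AABB b = {pts[0].x, pts[0].y, pts[0].z, pts[0].x, pts[0].y, pts[0].z};
    for (const Point& p : pts) {
        b.xmin = std::min(b.xmin, p.x);  b.xmax = std::max(b.xmax, p.x);
        b.ymin = std::min(b.ymin, p.y);  b.ymax = std::max(b.ymax, p.y);
        b.zmin = std::min(b.zmin, p.z);  b.zmax = std::max(b.zmax, p.z);
    }
    return b;
}

// Bounding sphere centred at the geometric centre of a non-empty vertex set.
Sphere computeSphere(const std::vector<Point>& pts) {
    Sphere s = {0.0, 0.0, 0.0, 0.0};
    for (const Point& p : pts) { s.u += p.x; s.v += p.y; s.w += p.z; }
    const double n = static_cast<double>(pts.size());
    s.u /= n;  s.v /= n;  s.w /= n;
    double dmax = 0.0;  // largest squared distance from the centre
    for (const Point& p : pts) {
        const double d = (p.x - s.u) * (p.x - s.u) + (p.y - s.v) * (p.y - s.v)
                       + (p.z - s.w) * (p.z - s.w);
        dmax = std::max(dmax, d);
    }
    s.r = std::sqrt(dmax);
    return s;
}

// Combined AABB of two child volumes.
AABB merge(const AABB& a, const AABB& b) {
    return {std::min(a.xmin, b.xmin), std::min(a.ymin, b.ymin),
            std::min(a.zmin, b.zmin), std::max(a.xmax, b.xmax),
            std::max(a.ymax, b.ymax), std::max(a.zmax, b.zmax)};
}

// Combines n child volumes two at a time, in n-1 merge steps.
AABB mergeAll(const std::vector<AABB>& boxes) {
    AABB result = boxes[0];
    for (std::size_t i = 1; i < boxes.size(); ++i)
        result = merge(result, boxes[i]);
    return result;
}
```

Because the minimum and maximum operations are associative, the AABB obtained by merging child boxes is the same as the AABB computed directly from all the vertices. This is not true for spheres in general.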

In the case of spheres, let the parameters of the two volumes be given by 
{u1, v1, w1, r1} and {u2, v2, w2, r2}. The required parameters of the combined sphere 
are denoted as {uc, vc, wc, rc}. First we compute the distance between the centres: 

\[
d = \sqrt{(u_2 - u_1)^2 + (v_2 - v_1)^2 + (w_2 - w_1)^2}
\tag{3.2}
\]

If d ≤ |r1 − r2|, then one of the spheres is inside the other. The combined sphere 
in this case is the same as the larger among the two spheres. If d > |r1 − r2|, the 
spheres either overlap or are disjoint. For this configuration, we compute the radius 
and the centre of the combined sphere as follows: 

\[
\begin{aligned}
r_c &= \tfrac{1}{2}\,(d + r_1 + r_2) \\
u_c &= u_1 + \frac{1}{2d}\,(d - r_1 + r_2)\,(u_2 - u_1) \\
v_c &= v_1 + \frac{1}{2d}\,(d - r_1 + r_2)\,(v_2 - v_1) \\
w_c &= w_1 + \frac{1}{2d}\,(d - r_1 + r_2)\,(w_2 - w_1)
\end{aligned}
\tag{3.3}
\]
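
The two cases for combining spheres can be sketched in C++ as shown below. The names Sphere and mergeSpheres are hypothetical. The radius formula rc = (d + r1 + r2)/2 used here is an assumption: it is the standard result for the smallest sphere enclosing two spheres, and it is consistent with the centre formula for wc in Eq. 3.3.

```cpp
#include <cmath>

struct Sphere { double u, v, w, r; };

// Smallest sphere that encloses the two given spheres.
Sphere mergeSpheres(const Sphere& a, const Sphere& b) {
    const double du = b.u - a.u, dv = b.v - a.v, dw = b.w - a.w;
    const double d = std::sqrt(du * du + dv * dv + dw * dw);  // Eq. 3.2

    // One sphere lies inside the other: the larger sphere is the result.
    if (d <= std::fabs(a.r - b.r))
        return (a.r >= b.r) ? a : b;

    // Overlapping or disjoint spheres. Here d > |r1 - r2| >= 0, so the
    // division by d below is safe.
    const double rc = 0.5 * (d + a.r + b.r);
    const double k = (d - a.r + b.r) / (2.0 * d);
    return {a.u + k * du, a.v + k * dv, a.w + k * dw, rc};
}
```

For two unit spheres centred at (0, 0, 0) and (4, 0, 0), the function gives a sphere of radius 3 centred at (2, 0, 0), which touches the far side of each input sphere.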


A detailed description of different types of bounding volumes, their computation 
and intersection tests is given later in Sect. 9.1. 


3.5 Sample Implementation 


In this section, we will discuss the design of a set of classes that implement the 
functionality of a scene graph with transformation matrices attached to its nodes. 
Internal nodes that can store a list of children, and also a transformation matrix, 
are represented by the class GroupNode. All transformation matrices are assumed 
to have the general form given by T(v)R(0). The properties of leaf nodes are 
specified by three classes: ObjectNode that can represent a three-dimensional 
object, CameraNode that represents the camera, and LightNode that represents 
a light source. These three classes are derived from GroupNode so that we can 
store all child nodes (including group nodes and object nodes) with the same type, 
and also use polymorphic functions to implement tree traversal algorithms. 


3.5.1 Group Node 


The declarations of attributes and functions of GroupNode can be found in 
Listing 3.1 below. The primary functions associated with a group node include 
adding and removing children, and setting the transformation parameters. We use 
the list container of the Standard Template Library (STL) for storing references 
to the child nodes. The data members _angleX, _angleY, _angleZ specify 
the Euler angles of rotation about the principal axes of the group's coordinate 
frame. Similarly _tx, _ty, _tz denote the components of the translation vector 
along the principal axes directions. Together, these attributes define the composite 
transformation for the group node in the form T(v)Rz(ψz)Ry(ψy)Rx(ψx), where v 
is the translation vector, and the ψ's denote Euler angles. The function render() is 
called on the root node to render the scene. 

Listing 3.1 Class definition for a group node 

#include <cstddef>   // NULL
#include <list>
using namespace std;

class GroupNode
{
private:
    list<GroupNode*> children;      // child nodes of this group
protected:
    GroupNode* parent;              // NULL for the root node
    float _tx, _ty, _tz, _angleX, _angleY, _angleZ;
    virtual void draw();            // overridden by the leaf node classes

public:
    GroupNode()
        : parent(NULL),
          _tx(0.0), _ty(0.0), _tz(0.0),
          _angleX(0.0), _angleY(0.0), _angleZ(0.0) {}
    virtual ~GroupNode() {}

    void addChild(GroupNode* node);
    void removeChild(GroupNode* node);
    void translate(float tx, float ty, float tz);
    void rotateX(float angle);
    void rotateY(float angle);
    void rotateZ(float angle);
    void inverseTransform() const;
    void render();
    GroupNode* getParent() const;
    int getChildCount() const;
};
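
Listing 3.1 gives only declarations. One possible way of implementing the child management and the recursive traversal is sketched below using a simplified, hypothetical class SceneNode. It is not the code from the companion website. To keep the sketch independent of OpenGL, draw() records the node name in a log instead of issuing drawing calls.

```cpp
#include <list>
#include <string>
#include <vector>

class SceneNode {
public:
    explicit SceneNode(const std::string& name)
        : name_(name), parent_(nullptr) {}
    virtual ~SceneNode() {}

    // Attaches node to this group, detaching it from any previous parent.
    void addChild(SceneNode* node) {
        if (node == nullptr || node == this) return;
        if (node->parent_ != nullptr) node->parent_->removeChild(node);
        node->parent_ = this;
        children_.push_back(node);
    }

    // Detaches node if it is currently a child of this group.
    void removeChild(SceneNode* node) {
        if (node == nullptr || node->parent_ != this) return;
        children_.remove(node);
        node->parent_ = nullptr;
    }

    // Depth-first traversal: process this node, then each sub-tree.
    // A legacy OpenGL version would typically call glPushMatrix(), apply
    // the node transformation, draw, recurse, and call glPopMatrix().
    void render(std::vector<std::string>& log) {
        draw(log);
        for (SceneNode* child : children_)
            child->render(log);
    }

    SceneNode* getParent() const { return parent_; }
    int getChildCount() const { return static_cast<int>(children_.size()); }

protected:
    // Polymorphic drawing hook; derived classes may override it.
    virtual void draw(std::vector<std::string>& log) { log.push_back(name_); }

private:
    std::string name_;
    SceneNode* parent_;
    std::list<SceneNode*> children_;
};
```

Nodes are stored as non-owning pointers, as in Listing 3.1, so the application remains responsible for their lifetime. The sketch does not check for cycles, which would have to be prevented in a production implementation.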


3.5.2 Object Node 


The class definition for an object node must cater to the requirements of defining 
and storing three-dimensional object models. Listing 3.2 gives the declarations of 
important attributes and functions of the class. To simplify the implementation, 
we use only the built-in objects provided by the GL Utility Toolkit (GLUT) 
of the OpenGL API. These objects are assigned numbers using the enumerated 
type ObjType. When an object is initially defined using the setObject() 
function, it may also be optionally scaled using parameters scaleX, scaleY 
and scaleZ. These parameters are used to set the values of the corresponding 
data members of the class. An object may also be given a material colour using the 
function setColor(). A scene is rendered by calling the function render() 
of the GroupNode class on an instance that represents the scene graph's root. 
This function in turn calls the polymorphic function draw() which is declared as 
virtual in GroupNode. The implementation of the function in ObjectNode 
will call the necessary OpenGL functions to apply the transformations and to draw 
the object. 

Listing 3.2 Class definition for an object node 

class ObjectNode : public GroupNode
{
public:
    enum ObjType
        { CUBE, SPHERE, TORUS, TEAPOT, CONE, TETRAHEDRON };
    ObjectNode()
        : GroupNode(),
          _object(CUBE),
          _scaleX(1.0f), _scaleY(1.0f), _scaleZ(1.0f),
          _colorR(1.0f), _colorG(1.0f), _colorB(1.0f)
    {}
    ~ObjectNode() {}
    void setObject(ObjType object,
                   float scaleX, float scaleY, float scaleZ);
    void setColor(float colorR, float colorG, float colorB);

private:
    ObjType _object;                    // GLUT object to be drawn
    float _scaleX, _scaleY, _scaleZ;    // initial scale factors
    float _colorR, _colorG, _colorB;    // material colour
    void draw();
};


3.5.3 Camera Node 


Any three-dimensional scene is assumed to have an active camera that contains 
information about the projective transformation used while rendering the scene. The 
camera also provides the view matrix needed for the transformation of vertices to 
the eye coordinate space. A camera can be added to a scene graph as a special type 
of object node. Listing 3.3 gives the class definition for the camera node. Since 
only one instance of the camera is used in a scene at any point in time, the class 
CameraNode is defined as a singleton class. It has a private constructor, and the 
static instance is made available to a program using the function getInstance(). 
The frustum parameters are specified by an application by calling the function 
perspective(). The function projection() uses these parameters to set up 
the projection matrix, and is called by render() of the GroupNode class. The 
view transformation matrix is constructed by the function viewTransform() by 
traversing the tree along the path from the camera node to the root node (Fig. 3.11). 
The class does not store any drawable object, and therefore draw() has an empty 
function body. 
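
The singleton mechanism described above can be illustrated with a small stand-alone sketch. The class Camera below is hypothetical and is not derived from GroupNode. The body of getInstance() is an assumption about how a flag and a static pointer might be used, and the parameter names nearDist and farDist are chosen here only to avoid clashes with the near and far macros defined by some platform headers.

```cpp
class Camera {
public:
    // Creates the single instance on first use and returns it afterwards.
    static Camera* getInstance() {
        if (!flag) {
            camera = new Camera();
            flag = true;
        }
        return camera;
    }

    // Stores the frustum parameters for later use by a projection function.
    void perspective(float fov, float aspect, float nearDist, float farDist) {
        _fov = fov;
        _aspect = aspect;
        _near = nearDist;
        _far = farDist;
    }

    float fov() const { return _fov; }

    ~Camera() { flag = false; }

    // Copying would create a second camera, so it is disabled.
    Camera(const Camera&) = delete;
    Camera& operator=(const Camera&) = delete;

private:
    Camera() : _fov(60.0f), _aspect(1.0f), _near(1.0f), _far(1000.0f) {}

    float _fov, _aspect, _near, _far;
    static bool flag;       // true once the instance has been created
    static Camera* camera;  // the only instance of the class
};

bool Camera::flag = false;
Camera* Camera::camera = nullptr;
```

Every call to Camera::getInstance() returns the same pointer, so frustum parameters set in one part of a program are visible wherever the camera is used.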


3.5.4 Light Node 


The LightNode class as defined in Listing 3.4 has a simple structure containing no 
public functions other than the constructor. The constructor accepts a single integer 
between 0 and 7 as the argument which directly represents one of the OpenGL light 
sources GL_LIGHT0, ..., GL_LIGHT7. In OpenGL, light sources are transformed 
like any other point. The function draw() defines the initial position of the light 
source at (0,0,0), and transforms it exactly like its counterpart in ObjectNode. 
The class does not store or set any other light or material properties. They can be 
set by the application by directly calling the appropriate OpenGL functions. The 
same applies to setting OpenGL states such as enabling lighting, selecting two-sided 
lighting, enabling colour material, and so on. 

Listing 3.3 Class definition for a camera node 

class CameraNode : public GroupNode
{
private:
    float _fov, _aspect, _near, _far;   // frustum parameters
    static bool flag;
    void draw() {}
    CameraNode()
        : GroupNode(),
          _fov(60.0f), _aspect(1.0f),
          _near(1.0f), _far(1000.0f)
    {}
    CameraNode(const CameraNode&);      // copying is not permitted
    CameraNode& operator=(const CameraNode&);
    static CameraNode* camera;

public:
    static CameraNode* getInstance();
    void perspective(float fov, float aspect, float near, float far);
    void projection() const;
    void viewTransform() const;
    ~CameraNode() { flag = false; }
};

Listing 3.4 Class definition for a light node 

class LightNode : public GroupNode
{
private:
    int _glLight;                       // index of the OpenGL light source
    void draw();

public:
    LightNode(int glLight)
        : GroupNode(),
          _glLight(glLight)
    {}
    ~LightNode() {}
};

The sample implementation of a scene graph discussed above concatenates only 
transformation matrices along different paths from the root node to the leaf nodes. 
The hierarchical structure of a scene graph allows several other attributes to be 
propagated from an internal node to object nodes through various branches. One 
such attribute is the visibility of a node. If a node's visibility attribute is set to false, 
then the visibility attribute of every node in that sub-tree can also be implicitly set to 
false by using a logical AND operation with the values from the parent nodes. Thus 
an object node will not be rendered if any of its ancestors has a visibility attribute 
set to false. A similar attribute that can be attached to the nodes is transparency. The 
transparency values can be multiplied together along every path from the root node, 
to determine the net transparency of objects stored in the leaf nodes. 
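
The propagation of such attributes can be sketched with a simple recursive function. The Node structure and the function propagate() below are hypothetical and are not part of the classes in Listings 3.1 to 3.4. The transparency value is assumed to be a factor between 0 and 1.

```cpp
#include <vector>

struct Node {
    bool visible;                 // the node's own visibility attribute
    double transparency;          // the node's own transparency factor
    std::vector<Node*> children;  // empty for leaf (object) nodes
    bool netVisible;              // result: AND of values along the path
    double netTransparency;       // result: product of values along the path
};

// Pushes the accumulated attributes from a node down to its sub-tree.
void propagate(Node& node, bool parentVisible = true,
               double parentTransparency = 1.0) {
    node.netVisible = parentVisible && node.visible;
    node.netTransparency = parentTransparency * node.transparency;
    for (Node* child : node.children)
        propagate(*child, node.netVisible, node.netTransparency);
}
```

Calling propagate() on the root fills in the net values for every node. A renderer would then skip any object node whose netVisible value is false.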


3.6 First-Person View 


The design of the camera node as outlined in the previous section permits a highly 
flexible implementation of a scene graph, since the only static instance of the class 
can be obtained anywhere by calling the getInstance() function. The camera 
node need not even be a part of the scene graph, if the camera is meant to be in a 
fixed location with respect to the scene. In this case, the transformations defined for 
the camera node specify the position and the orientation of the camera with respect 
to the origin of the world coordinate frame. These transformations will be directly 
used to obtain the view matrix for the whole scene. 

Often you will require the first-person view of a scene with the camera placed on 
a moving object. For the articulated character model in Fig. 3.7, the first-person view 
is provided when the camera is attached to the head. This is done by first applying 
transformations to the camera node so that it points to the right direction in the 
coordinate frame of the object node to which it should be attached. In the scene 
graph, the object node is replaced by a new group node. Both the camera node and 
the object node are attached to the new group node as its children. Figure 3.16 
shows the reference frame (xe, ye, ze) of the camera and the coordinate frame 
(x, y, z) of the head of the character model. The camera initially points towards the −ze 
direction. It is rotated about the y-axis by 180° to point towards the head direction. 
This transformation is represented by the matrix R(@). Figure 3.16 also shows the 
modified portion of the scene graph in Fig. 3.7 with the addition of a new group 
node and the camera node. 

Now consider the 5-link joint chain shown in Fig. 3.3. Robotic arms such as this 
can be found in autonomous systems for inspection, welding and painting. The arm 
is driven by feeding joint angles to the controllers. Some constraints may be applied 
to the joint angles based on the application requirements. For example, a robotic 
arm for welding or painting may require the end effector (denoted by Link-5 in 
Fig. 3.3) to be kept in a horizontal position. It may also be required to have a camera 
attached to the end effector to obtain a clear perspective of the surrounding scene 
from its viewpoint. The graphical rendering of the scene as viewed from the position 
of Link-5 can be obtained by adding the camera node to the group node Group-4 as 
shown in Fig. 3.17. 

From the previous examples, we have seen that the first step in the process 
of attaching a camera to an object node is to determine the rotational transformation 
necessary to appropriately orient the camera in the local coordinate frame of the 
object. In the example in Fig. 3.17, this composite transformation comprises two 
rotations: a rotation of 90° about the x-axis followed by another rotation of −90° 
about the y-axis. The transformation functions given in Listing 3.1 allow us to define 
such rotations. It is also important to note that when a new group node is formed 
with the camera node and the object node as its children, transformations that were 
previously applied to the object node should now be applied to the camera as well. 
Therefore, the transformation matrix that was attached to the object node must now 
be transferred to the common group node. This would often leave the object node 
with the identity matrix as shown in Fig. 3.17. 

Fig. 3.16 (a) Camera coordinate system. (b) A 3D object “Head” in its local coordinate frame. 
(c) The modified portion of the scene graph in Fig. 3.7, with the camera node attached 

Fig. 3.17 (a) Local coordinate frame of a link of the joint chain in Fig. 3.3. (b) The desired 
orientation of the camera frame relative to the frame of the link. (c) Addition of the camera node 
to the scene graph in Fig. 3.3 
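
The restructuring of the scene graph described above can be sketched as follows. The structures Transform and Node and the function attachCamera() are hypothetical simplifications introduced for this example. The camera node is assumed to carry its own orienting rotation already, and the new group node is supplied by the caller.

```cpp
#include <algorithm>
#include <vector>

struct Transform { float tx, ty, tz, angleX, angleY, angleZ; };

struct Node {
    Transform xf;                 // transformation relative to the parent
    Node* parent;                 // nullptr if the node is not attached
    std::vector<Node*> children;
};

// Replaces `object` in the scene graph by `group`, and makes both the
// object and the camera children of that group.
void attachCamera(Node& object, Node& camera, Node& group) {
    // The object's transformation moves to the common group node, which
    // leaves the object node with the identity transformation.
    group.xf = object.xf;
    object.xf = Transform{0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f};

    // The group takes the object's place in its former parent's child list.
    group.parent = object.parent;
    if (object.parent != nullptr) {
        std::vector<Node*>& siblings = object.parent->children;
        std::replace(siblings.begin(), siblings.end(), &object, &group);
    }

    group.children.clear();
    group.children.push_back(&object);
    group.children.push_back(&camera);
    object.parent = &group;
    camera.parent = &group;
}
```

After the call, any transformation applied to the group node moves the object and the camera together, while the camera keeps its own orienting rotation relative to the object's frame.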


3.7 Summary 


Scene graphs are powerful data structures that can be used for hierarchical rep- 
resentations of transformations, bounding volumes and other visual attributes of 
groups of objects in a scene. This chapter showed the application of scene graphs in 
defining the transformations of interconnected systems. Robotic manipulator arms 
and articulated character models are examples of such systems containing one or 
more joint chains. Using a scene graph, the relative transformation of one object 
with respect to another can be easily computed. Relative transformations are useful 
for displaying billboards and first-person views. This chapter also introduced the 
definition of a scene graph in the standard form. An object-oriented framework 
for a scene graph was presented and some of the key implementation aspects were 
discussed. 

The next chapter will show that scene graphs play an important role in skeletal 
animation. Skeletal structures and the associated hierarchical transformations used 
in vertex skinning algorithms fit perfectly well with the scene graph model. 


3.8 Supplementary Material for Chap. 3 


The folder Chapter3/Code on the companion website contains code examples 
demonstrating the application of the scene graph class in the modelling and 
rendering of simple three-dimensional scenes. A brief description of these programs 
is given below. 


1. GroupNode.cpp 


Additional files: 


These are the header and implementation files for a scene graph class as 
discussed in Sect. 3.5. The documentation of methods in this class can be found 
in Appendix B. 


2. Scene3D.cpp 


Additional files: 


This program uses a scene graph to model a scene consisting of four different 
stationary objects and demonstrates the use of the classes discussed in Sect. 3.5. 
The scene graph has a simple structure consisting of the World node and 
four object nodes. The camera node is not attached to the scene graph and 
is independently transformed to simulate camera motion along a circular path 
around the group of objects. The light source is kept fixed in the middle of the 
scene, at its default position (0, 0, 0). 


3. Planet.cpp 


Additional files: 


GroupNode.cpp 
GroupNode.h 


This program uses the scene graph in Fig. 3.10, to model the planetary system 
in Fig. 3.8. The angles of revolution of the Moon around the Earth, and the 
joint Earth-Moon system around the Sun are continuously updated to generate 
an animation sequence. The light source is kept fixed at the location of the Sun. 


4. Link5.cpp 


Additional files: 
JointAngles.txt 


This program uses the scene graph model given in Fig. 3.3, to construct 
an animated 5-link robotic arm. The joint angles are read in from the file 
JointAngles.txt. The arm moves continuously up and down in front of 
a vertical coloured wall. The joint angles are defined such that the end effector 
of the arm is always horizontal. Pressing ‘c’ on the keyboard causes the scene 
graph to be modified as in Fig. 3.17 to produce the effect of the camera being 
placed on the end effector. This gives a close-up view of the coloured wall from 
the perspective of the continuously moving end effector. 


5. GlutMan.cpp 


Additional files: 


GroupNode.cpp 


The program GlutMan demonstrates the use of a scene graph in modelling 
and animating an articulated character model. A scene graph similar to the one 
given in Fig. 3.7 is used. The values of eight joint angles defining a simple 
walk sequence are read from the input file WalkCycle.txt and interpolated 
to generate a continuous animation sequence. 


3.9 Bibliographical Notes 


An excellent introduction to scene graphs and other tools for scene management 
can be found in Sherrod (2007). The book also deals with the design of data 
structures and algorithms for similar applications. Angel (2008), McConnell (2006) 
and McReynolds and Blythe (2005) give an overview of hierarchical modelling 
techniques and applications using scene graphs. Eberly (2007) contains a chapter 
on hierarchical scene representations, and provides a detailed description of scene 
graph operations designed for merging a set of bounding volumes. 

Support for scene graphs including sophisticated high-level functionalities can 
be found in graphics APIs. Java-3D provides powerful classes for constructing the 
nodes of a scene graph that can be used for rendering scenes. Many examples of 
applications in Java can be found in Davison (2005). The M3G API of Java Micro 
Edition also contains a versatile collection of methods useful for retained-mode 
rendering based on scene graphs. These methods incorporate high-level functions 
for generating key-frame animations on mobile devices. Pulli (2008) provides an 
excellent coverage of the M3G API and shows the importance of scene graphs in 
the design of animation sequences. 

OpenSceneGraph is a versatile high-level 3D graphics toolkit useful for the 
development of high-end graphics applications based on a full-fledged and powerful 
scene graph implementation. More information can be found on the website, 
http://www.openscenegraph.org. 


Chapter 4 
Skeletal Animation 


Overview 


This chapter discusses concepts such as vertex blending, vertex skinning and 
keyframing that are fundamental to the animation of articulated character models. 
Vertex blending is the process of constructing blending surfaces between two 
different parts that move relative to each other, in order to create the appearance of 
a single deformable object. Vertex blending is useful in the animation of character 
models constructed by joining together several individual components. 

Mesh models of animatable characters are often subdivided into groups of 
vertices that represent moveable body parts. A skeleton is an abstract representation 
of this form of partitioning of a mesh. Skeletal animation refers to the process 
of computing the transformations of each segment in the skeleton using joint 
angles, and mapping them on to mesh vertices. The chapter discusses various stages 
in skeletal animation, describes the transformations applied to a mesh, and also 
outlines a scene graph based implementation. 


4.1 Articulated Character Models 


Animated character models can be found in numerous applications of computer 
graphics, ranging from simple computer games to virtual agents and computer generated 
feature films. Depending on the application requirements, the character mesh 
and the animation sequence can have varying levels of complexity. Sophisticated 
virtual character agents incorporate several forms of articulation including facial 
expression animation. In this chapter we will look at the basics of human character 
animation with simple polygonal models and a small number of joint angles. 

We broadly classify character models into two groups: (i) character models 
constructed using several objects or “parts” where each object is independently 
transformed and moved into its respective position within the model, and (ii) single 
mesh models that are animated by attaching vertices to different transformation 
groups. An example of each type is shown in Fig. 4.1. The first model, the “Glut 
Man”, is constructed entirely using scaled and transformed versions of cubes 
generated using glutSolidCube() or glutWireCube(), hence the name. 
The second belongs to the more commonly found class of mesh models. 

Fig. 4.1 Character models constructed using (a) several component objects, and 
(b) a single mesh 

In the case of the model constructed using individual parts, each component 
is first created in its own local coordinate space. A series of transformations is 
then applied to it based on where in a joint chain that component appears. This 
process, which is very similar to what we saw in the previous chapter (Fig. 3.5), 
is repeated for every part of the model to reshape the character in a required pose. 
The transformations often have a well-defined hierarchical structure as discussed in 
the context of scene graphs. Figure 3.7 shows how the main body parts of a simple 
humanoid model are transformed. 

A character model defined using a single mesh surface as in Fig. 4.1b requires 
a completely different set of coordinate transformations, as all mesh vertices are 
specified in a common reference system. However, we should be able to use the 
same set of joint angles to animate this model also, producing a similar effect 
such as a walk cycle. We can indeed construct a “virtual” skeleton consisting of 
joints and links that has a structure similar to our previous model in Fig. 4.1a. 
We can then associate the skeleton with the continuous mesh. This association is 
done by attaching a set of vertices belonging to each body part (e.g., forearm) 
to the corresponding link of the skeleton. The scene graph based transformations 
computed using joint angles can now be directly applied to the skeleton. The mesh 
vertices are transformed using a simple method introduced in Sect. 4.4.2. 

If a model is made up of several parts as in Fig. 4.1a, where parts move or 
rotate relative to their neighbours, gaps can appear at joints when the model is 
animated. The next section addresses this problem, and introduces the method of 
vertex blending for creating deformable surface patches between parts that move 
relative to each other. 


4.2 Vertex Blending 


When two different mesh objects attached to a common pivot rotate by different 
angles, certain parts of the surfaces can interpenetrate, and gaps can appear on the 
opposite side (Fig. 4.2a). Repairing or “re-meshing” an area where two surfaces 
interpenetrate is a difficult task. Moveable surfaces are therefore often separated by 
a small distance from each other, so that they do not touch for the allowable range 
of movement or rotation angles (Fig. 4.2b). A sphere is sometimes placed at rotary 
joints, as in Fig. 4.2c, to fill the gap. While this approach is suitable for robot-like 
models, interpolation methods could be used for obtaining a better approximation of 
blending surfaces between moving parts. The process of creating such in-between 
surfaces is called vertex blending. 

Corresponding pairs of points on two moving parts can be joined together to form 
a triangular or quadrilateral element belonging to the intermediate surface. These 
elements could be further subdivided using a simple linear interpolation formula 
(Eq. 2.43) to get a tessellated surface (Fig. 4.3a). We discuss below higher order 
interpolation methods for generating blending surfaces (Fig. 4.3b). 

Fig. 4.2 (a) Moving parts of an animated model can interpenetrate and form gaps at 
joints. (b) Links can be separated by a short distance to avoid surface intersections. 
(c) A sphere is sometimes attached to a rotary joint to fill the gap between two 
moving parts 

Fig. 4.3 Generation of blending surfaces using (a) linear interpolation and 
(b) Hermite interpolation 

Fig. 4.4 Generation of a blending surface using (a) Bezier interpolation and 
(b) Hermite interpolation 

In Chap. 2 (Sect. 2.7) we saw examples of second and higher degree interpolation 
functions with Bernstein polynomials as basis. We can use cubic Bezier polynomials 
to generate interpolating curves between moving parts with tangential continuity at 
end points. In Fig. 4.4a, $P_0$ and $P_3$ denote a pair of corresponding points on two 
moving parts of a character model. $Q_0$ and $Q_3$ are two points on the surfaces that 
are selected to define the local tangent directions $P_0 - Q_0$ and $P_3 - Q_3$ respectively. 
Using these tangent directions, we can specify two more points, $P_1$ and $P_2$, as 

$$P_1 = P_0 + \alpha (P_0 - Q_0)$$
$$P_2 = P_3 + \alpha (P_3 - Q_3) \tag{4.1}$$

where $\alpha$ is a positive quantity used to increase or decrease the length of the tangent 
vectors $P_1 - P_0$ and $P_2 - P_3$. Points on the interpolating Bezier curve are generated 
using the parametric equation (see Box 2.4, Sect. 2.7) 

$$Q(t) = (1-t)^3 P_0 + 3(1-t)^2 t\, P_1 + 3(1-t) t^2 P_2 + t^3 P_3, \quad 0 \le t \le 1. \tag{4.2}$$

Substituting the expressions from Eq. 4.1 in the above equation gives 

$$Q(t) = (1 - 3t^2 + 2t^3) P_0 + 3(1-t)^2 t\, \alpha (P_0 - Q_0) + 3(1-t) t^2 \alpha (P_3 - Q_3) + (3t^2 - 2t^3) P_3 \tag{4.3}$$

When $\alpha$ is increased, the weight of the tangent vectors on the interpolating curve 
is increased, and the curve gets closer to the tangents at the end points $P_0$, $P_3$. Care 
should be taken to ensure that the points $P_0$, $P_1$ both lie on the same side of the 
tangent $P_2 - P_3$, and similarly points $P_2$, $P_3$ lie on the same side of the tangent 
$P_1 - P_0$. Setting a large value of $\alpha$ violates this condition, resulting in a distorted 
Bezier curve. 

A second interpolation method that is suitable for vertex blending is Hermite 
interpolation. Here, the tangent directions are defined using vectors $P_0 - Q_0$ and 
$Q_3 - P_3$ (Fig. 4.4b), and the interpolating curve is given by 

$$H(t) = (1 - 3t^2 + 2t^3) P_0 + (t - 2t^2 + t^3)\, \alpha (P_0 - Q_0) + (-t^2 + t^3)\, \alpha (Q_3 - P_3) + (3t^2 - 2t^3) P_3 \tag{4.4}$$

The coefficients of $P_0$, $P_3$ are exactly the same as those in Bezier interpolation. 
Since tangents are defined along the direction of the curve from $P_0$ to $P_3$, Hermite 
interpolation does not have problems associated with large $\alpha$ values. Hermite and 
other types of approximating splines are discussed in more detail in Chap. 7. 


4.3 Skeleton and Skin 


Animating a three-dimensional character model (Fig. 4.1b) containing hundreds of 
vertices and polygons can be a challenging task. This task can be simplified to a 
great extent by grouping together a number of mesh vertices as forming body parts 
that move as a single unit, connected together by a set of joints. A human model 
may be modelled as a collection of body parts with joints at neck, shoulders, elbows, 
wrists, hips, knees, and ankles. The grouping of mesh primitives into body parts and 
the definition of joints depend on the complexity of the animation. In a simple walk 
sequence, for instance, the arms and legs could be considered as the only parts that 
move relative to the main body. For a more complex animation, one might require 
movement of the head, hands, fingers, facial muscle regions, and so on. Figure 4.5a 
shows how points in a mesh could be grouped into ten body parts: head (HEA), torso 
(TOR), left upper arm (LUA), left lower arm (LLA), right upper arm (RUA), right 
lower arm (RLA), left upper leg (LUL), left lower leg (LLL), right upper leg (RUL), 
and right lower leg (RLL). Every group can then have an abstract representation 
called a bone. The complete set of bones, along with their connectivity information, 
is called a skeleton (Fig. 4.5b). 

The notion of a skeleton consisting of a set of joint chains comprised of bones is 
central to articulated character animation. A skeleton can be easily animated; i.e., the 
transformations for the bones can be easily determined given the angles at each joint. 
The skeleton has a hierarchical structure similar to that of the model in Fig. 4.1a, 
the main difference being that in a skeleton, each component or bone is just an 
abstract structure, not a graphics primitive. A bone essentially stores information 
about its position and orientation relative to its parent in the skeleton. 

Fig. 4.5 (a) Vertices in a mesh model are grouped together into parts that move relative to each 
other. (b) A skeleton definition formed based on a vertex grouping 

Fig. 4.6 Two simple ways of associating vertices with bones of a skeleton 

Every bone is given a unique index as shown in Fig. 4.5b. Vertices belonging 
to a group are associated with a bone using the bone's index. The part of a mesh 
represented by a bone is called its skin. In the example given in Fig. 4.5, the skin 
of bone “8” is the mesh segment that belongs to the set LUL. Two simple ways of 
associating groups of vertices with bones are shown in Fig. 4.6. In the first method, 
every entry in the vertex list is appended with a bone index. This method is suitable 
when vertices need to be associated with more than one bone (we will discuss this 
process later in Sect. 4.6). If several consecutive entries in the vertex list have the 
same bone index, then the second method is preferred where the minimum and 
maximum indices of a range of vertices are stored against a bone index. 
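
The two ways of associating vertices with bones could be represented as follows. This is only a sketch: the structure names SkinVertex and BoneRange, the helper toRanges, and the use of 0-based indices are assumptions made for the example. 

```cpp
#include <vector>

// Method 1: every vertex entry carries the index of the bone it is attached to.
struct SkinVertex { float x, y, z; int boneIndex; };

// Method 2: every bone stores the first and last index of a contiguous vertex range.
struct BoneRange { int boneIndex, firstIndex, lastIndex; };

// Converts a per-vertex bone list (method 1) into index ranges (method 2).
// Each run of consecutive vertices with the same bone index becomes one range.
std::vector<BoneRange> toRanges(const std::vector<SkinVertex>& verts)
{
    std::vector<BoneRange> ranges;
    for (int i = 0; i < (int)verts.size(); ++i) {
        if (!ranges.empty() && ranges.back().boneIndex == verts[i].boneIndex)
            ranges.back().lastIndex = i;                    // extend the current run
        else
            ranges.push_back({verts[i].boneIndex, i, i});   // start a new run
    }
    return ranges;
}
```

The range form is compact only when the vertex list is ordered so that the vertices of each bone are stored consecutively; otherwise a bone would be split over several ranges. 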


4.4 Vertex Skinning 


In order to define the hierarchical nature of a skeleton, the parent-child relationship 
between every two connected bones must be shown. We could represent a bone 
using a point with arrow(s) pointing to its child node(s), as in Figs. 4.7a, c. Another 
common representation of a bone uses triangles. Fig. 4.7b shows the mesh model of 
a human arm, and the associated skeleton consisting of a set of bones. Each bone 
stores the index of its parent and the bone's position relative to its parent. Using this 
information, a complete hierarchical structure can be built, as shown in Fig. 4.7c. 
There are two special nodes in this skeleton tree. The root node always represents 
the origin of the world coordinate system, and has an index 0. The base node is that 
bone in the skeleton which has root as its parent. The position and the orientation of 
the base define the pose of the skeleton in the world coordinate space. 

Bones are not physical structures present in a polygonal mesh, but are only 
animation tools or controlling mechanisms used to transform the mesh in a realistic 
manner. A bone also loosely represents the region of influence of a transformation. 


4.4.1 The Bind Pose 


The hierarchical organization of bones in a skeleton allows the geometric transformation 
for each bone to be defined with respect to its parent. The transformations 
that are associated with a bone are normally a joint angle rotation followed by a 
translation from its parent bone. The translations of the bones, each relative to its 
parent, together define the initial configuration of a skeleton. For this configuration, 
the joint angles are set to 0. The corresponding mesh is said to be in the bind 
pose (Fig. 4.5a). The placement of bones in the skeleton can be obtained by first 
computing the axis-aligned bounding boxes (see Box 3.1 of Chap. 3) of vertex 
groups (defined as in Fig. 4.6), and then determining the joint positions for each box. 
Fig. 4.5b shows an example. For now, we will assume that each vertex is attached to 
one and only one bone. We will consider a more general case of associating a vertex 
with two or more bones, in Sect. 4.6. In the following section, we will see how joint 
angle transformations applied to bones can be transferred to mesh vertices attached 
to them. 

Fig. 4.7 (a) A simple joint chain. (b) A skeletal structure for the arm, hand and fingers. 
(c) Modified version of the skeleton in Fig. 4.5b 

Fig. 4.8 Transformation of a mesh vertex V using the transformations of its bone 
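
The bounding-box step mentioned above can be sketched as follows for a vertex group given by an index range. The names Vec3, Box and boundingBox are assumptions for this example, and indices are 0-based. How a joint position is chosen from each box is not specified in the text, so that step is left out. 

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };
struct Box  { Vec3 lo, hi; };   // minimum and maximum corners of the box

// Axis-aligned bounding box of the vertices with indices first..last (inclusive).
// Assumes 0 <= first <= last < verts.size().
Box boundingBox(const std::vector<Vec3>& verts, int first, int last)
{
    Box b = {verts[first], verts[first]};
    for (int i = first + 1; i <= last; ++i) {
        const Vec3& v = verts[i];
        b.lo = {std::min(b.lo.x, v.x), std::min(b.lo.y, v.y), std::min(b.lo.z, v.z)};
        b.hi = {std::max(b.hi.x, v.x), std::max(b.hi.y, v.y), std::max(b.hi.z, v.z)};
    }
    return b;
}
```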


4.4.2 Mesh Vertex Transformation 


Consider a mesh vertex $V$ attached to a bone $i$ in bind pose (Fig. 4.8). In this 
configuration, each bone has an associated matrix $B_i$ that defines the transformation 
from the bone's local coordinate space to the skeleton's coordinate space. 
This transformation depends only on the translations of bones in the hierarchy 
relative to their parents. The process of obtaining this transformation matrix will 
be discussed below. For a given joint rotation by an angle $\theta$, the transformed 
configuration of the bone in the skeleton's coordinate space is represented by 
another bone matrix $B'_i$. To get the transformed vertex $W$, we transfer the original 
point $V$ from the coordinate space of the mesh (which is the same as the skeleton 
space) back to its bone's local coordinate space, and then apply the joint angle 
transformation to return to the skeleton space. In other words, the vertex $V$ is first 
transformed using the inverse of the matrix $B_i$, then by $B'_i$. The first transformation 
gives the point $B_i^{-1} V$. Applying the matrix $B'_i$ to this point yields coordinates of the 
transformed point $W$. Thus 

$$W = (B'_i B_i^{-1}) V \tag{4.5}$$

The above equation is fundamental to skeletal animation, as it describes how 
transformations applied to a bone $i$ can be propagated to an attached mesh vertex $V$. 
The matrix $B_i$ depends only on the initial configuration of the skeleton, and therefore 
the points $B_i^{-1} V$ can be pre-computed and used for the entire animation sequence. 
As an example, we consider the model in Fig. 4.9, and show how it can be 
transformed using a skeleton comprising three bones. 

Fig. 4.9 An example showing transformations using three bones. (a) Bind pose and 
(b) transformed pose 

Let $d_1$ denote the translation vector used for moving Bone-1 from its local 
coordinate space to the skeleton space. Let $d_2$ denote the vector by which Bone-2 
is translated in Bone-1's coordinate space. The vector $d_3$ similarly represents the 
translation of Bone-3 in the coordinate space of Bone-2. Vertices $V_1$, $V_2$, $V_3$ are 
attached to Bone-1, Bone-2, and Bone-3 respectively on the mesh in its bind pose 
(Fig. 4.9a). We seek to find the transformed coordinates of these vertices, when 
the skeleton is transformed using joint angles $\theta_1$, $\theta_2$, $\theta_3$ respectively as shown in 
Fig. 4.9b. If we represent translation matrices by $T$, the initial bone matrices are 
given by 

$$B_1 = T(d_1)$$
$$B_2 = T(d_1)\, T(d_2) = T(d_1 + d_2)$$
$$B_3 = T(d_1)\, T(d_2)\, T(d_3) = T(d_1 + d_2 + d_3) \tag{4.6}$$

When the bones are transformed using the joint angles, the bone matrices for the 
transformed configuration become 

$$B'_1 = T(d_1)\, R(\theta_1)$$
$$B'_2 = T(d_1)\, R(\theta_1)\, T(d_2)\, R(\theta_2)$$
$$B'_3 = T(d_1)\, R(\theta_1)\, T(d_2)\, R(\theta_2)\, T(d_3)\, R(\theta_3) \tag{4.7}$$

where $R$ denotes a rotational transformation matrix. Now applying Eq. 4.5, we can 
write the expressions for the transformed vertex coordinates as 

$$W_1 = T(d_1)\, R(\theta_1)\, T(-d_1)\, V_1$$
$$W_2 = T(d_1)\, R(\theta_1)\, T(d_2)\, R(\theta_2)\, T(-d_1 - d_2)\, V_2$$
$$W_3 = T(d_1)\, R(\theta_1)\, T(d_2)\, R(\theta_2)\, T(d_3)\, R(\theta_3)\, T(-d_1 - d_2 - d_3)\, V_3 \tag{4.8}$$

So far we have assumed that each vertex is associated with only a single bone. 
Section 4.6 discusses a more general case. 
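
To make the second line of Eq. 4.8 concrete, the sketch below evaluates the transformed position of a vertex attached to Bone-2 for a planar chain, where every joint rotates about the axis perpendicular to the plane. The planar simplification, the 3 x 3 homogeneous matrix type Mat3 and all function names are assumptions made for this example; the book's framework uses its own Matrix and Point3 classes. 

```cpp
#include <cmath>

struct Vec2 { float x, y; };
struct Mat3 { float m[3][3]; };   // homogeneous matrix for planar transformations

// Translation matrix T(d)
Mat3 translate(Vec2 d)
{
    Mat3 r = {{{1, 0, d.x}, {0, 1, d.y}, {0, 0, 1}}};
    return r;
}

// Rotation matrix R(theta), angle in degrees, anticlockwise
Mat3 rotate(float deg)
{
    float a = deg * 3.14159265f / 180.0f, c = std::cos(a), s = std::sin(a);
    Mat3 r = {{{c, -s, 0}, {s, c, 0}, {0, 0, 1}}};
    return r;
}

Mat3 operator*(const Mat3& a, const Mat3& b)
{
    Mat3 r = {};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

Vec2 apply(const Mat3& m, Vec2 v)
{
    return {m.m[0][0] * v.x + m.m[0][1] * v.y + m.m[0][2],
            m.m[1][0] * v.x + m.m[1][1] * v.y + m.m[1][2]};
}

// Transformed position W2 of a bind-pose vertex V2 attached to Bone-2.
Vec2 skinBone2(Vec2 V2, Vec2 d1, Vec2 d2, float theta1, float theta2)
{
    // Transformed bone matrix B'2 = T(d1) R(theta1) T(d2) R(theta2)
    Mat3 Bprime = translate(d1) * rotate(theta1) * translate(d2) * rotate(theta2);
    // Inverse of the bind-pose matrix B2 = T(d1 + d2) is T(-d1 - d2)
    Vec2 neg = {-d1.x - d2.x, -d1.y - d2.y};
    Mat3 Binv = translate(neg);
    return apply(Bprime * Binv, V2);
}
```

For example, with $d_1 = (0, 0)$, $d_2 = (2, 0)$ and a vertex at (3, 0), rotating only Bone-2 by 90° moves the vertex to (2, 1): the vertex lies one unit beyond the second joint, and swings about that joint. 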


4.5 Vertex Skinning Using Scene Graphs 


The vertex transformations (Eqs. 4.6, 4.7, 4.8) given in the previous section can 
be implemented using a scene graph for the skeleton. The scene graph is slightly 
different from the one we saw earlier in Chap. 3 (Fig. 3.3), in that each group 
node represents a bone with a matrix of the form $M = TR$ defining the relative 
transformation of the bone with respect to its parent. Each bone has a child node 
representing the set of mesh vertices associated with that bone. In Fig. 4.10, 
Bone-1, Bone-2, and Bone-3 form a joint chain in a skeleton, and $S_2$ denotes a 
set of mesh vertices associated with Bone-2. The initial bone matrix for Bone-3 
is $B_3 = M_1 M_2 M_3$. The vectors $B_i^{-1} V$ in Eq. 4.5 are obtained in a pre-processing 
phase, where each vertex set is transformed using the inverses of the matrices 
attached to nodes. As shown in Eq. 4.6, these matrices involve only translation 
components, and their inverses (as well as the product of inverses) can be easily 
computed. In the example given in Fig. 4.10a, a vertex $V$ belonging to the set $S_3$ 
would be transformed into 

$$V' = M_3^{-1} M_2^{-1} M_1^{-1} V = V - d_1 - d_2 - d_3 \tag{4.9}$$

As the tree is traversed from the root, matrices are combined by pre-multiplying 
the current product by the inverse of the matrix at the node, until a leaf node is 
reached. The vertices in a leaf node are transformed using the product of matrix 
inverses gathered up to that point. Thus the set $S_3$ becomes a new set $S_3'$ after the 
transformation in Eq. 4.9. The transformed set of vertices replaces the original set 
for the animation phase (Fig. 4.10b). 

Fig. 4.10 (a) Scene graph of a joint chain used for the pre-processing phase. (b) The updated 
mesh vertices in the animation phase 

In the animation phase, matrices at scene graph nodes are updated using the joint 
angles of the bones. The updated matrices are represented by $M'$ in Fig. 4.10b. The 
scene graph is again traversed from the root; matrices are combined, this time using 
post-multiplication, and applied to the vertices at leaf nodes to get the transformed 
mesh vertices. The vertices in the set $S_3$ would transform according to the following 
equation: 

$$W = M'_1 M'_2 M'_3 V' \tag{4.10}$$


If the set of vertices attached to each bone can be specified as a range of indices 
$(i, j)$ where $i$ is the start index and $j$ the end index of the set as in Fig. 4.6, then the 
structure of the scene graph can be simplified to a great extent as shown in Fig. 4.11. 
The vertex indices in the pre-processing phase point to the initial vertex list {V} 
of the mesh. After the pre-processing phase, they point to the list of intermediate 
vertices {V'} that are used as inputs in the animation phase. The transformed list of 
vertices {W} is used for rendering the mesh after applying joint angle rotations to 
the bones (Fig. 4.11). 

Fig. 4.11 Simplified scene graph for a joint chain using a vertex index range for each bone 
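
The two traversal phases can be sketched with a small recursive structure in which each bone stores a vertex index range. This is a planar illustration under assumed names (Vec2, Xform, Bone, preprocess, animate) with 0-based indices, not the SkeletonNode class of Sect. 4.8; the rigid transform Xform stands in for the matrix $M = TR$. 

```cpp
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Planar rigid transform p -> R p + t, standing in for a matrix of the form M = TR.
struct Xform {
    float c = 1, s = 0, tx = 0, ty = 0;   // cos, sin of the angle, and translation
    Vec2 apply(Vec2 p) const { return {c * p.x - s * p.y + tx, s * p.x + c * p.y + ty}; }
    // Product (*this) * m: m acts on a point first, then *this.
    Xform operator*(const Xform& m) const {
        Xform r;
        r.c = c * m.c - s * m.s;
        r.s = s * m.c + c * m.s;
        Vec2 t = apply({m.tx, m.ty});
        r.tx = t.x; r.ty = t.y;
        return r;
    }
};

struct Bone {
    Vec2 d;                       // translation relative to the parent (bind pose)
    float angleDeg = 0;           // joint angle used in the animation phase
    int first = 0, last = -1;     // inclusive vertex index range attached to this bone
    std::vector<Bone*> children;
};

// Pre-processing phase (Eq. 4.9): V' = V minus the sum of translations from the root.
void preprocess(const Bone& b, Vec2 offset, const std::vector<Vec2>& V, std::vector<Vec2>& Vp)
{
    offset = {offset.x + b.d.x, offset.y + b.d.y};
    for (int i = b.first; i <= b.last; ++i)
        Vp[i] = {V[i].x - offset.x, V[i].y - offset.y};
    for (const Bone* c : b.children) preprocess(*c, offset, V, Vp);
}

// Animation phase (Eq. 4.10): W = M1' M2' ... Mk' V', accumulated by post-multiplication.
void animate(const Bone& b, Xform parent, const std::vector<Vec2>& Vp, std::vector<Vec2>& W)
{
    float a = b.angleDeg * 3.14159265f / 180.0f;
    Xform local;                              // M' = T(d) R(theta)
    local.c = std::cos(a); local.s = std::sin(a);
    local.tx = b.d.x; local.ty = b.d.y;
    Xform world = parent * local;
    for (int i = b.first; i <= b.last; ++i) W[i] = world.apply(Vp[i]);
    for (const Bone* c : b.children) animate(*c, world, Vp, W);
}
```

In this sketch preprocess is called once after loading the mesh, and animate is called for every frame after the joint angles have been updated. Because the bind-pose matrices contain only translations, the pre-processing phase reduces to subtracting an accumulated offset. 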


4.6 Transformation Blending 


If every vertex is attached to only a single bone, then transformations applied to the 
bones may cause mesh surfaces to interpenetrate at a joint (Fig. 4.12a, b). 

Figure 4.12b also shows how large flat surface patches can appear at a joint 
when two adjacent vertices move away from each other because of a rotational 
transformation. It is intuitive to transform vertices in the neighbourhood of a joint 
using a combination of bone matrices which influence that joint. If $i$ and $j$ are two 
bones that influence a joint, then a vertex $V$ in the vicinity of the joint may be 
transformed using a weighted combination of the bones' matrices $B_i$ and $B_j$. The 
weights $w_i$ and $w_j$ are usually selected based on the relative distances of the vertex 
from the bones (Fig. 4.13). The final transformed point $W$ (Fig. 4.12c) is obtained as 

$$W = \{ w_i (B'_i B_i^{-1}) + w_j (B'_j B_j^{-1}) \} V \tag{4.11}$$

We also require the weights to satisfy the condition $w_i + w_j = 1$. A sample 
distribution of weights for mesh vertices of the joint in Fig. 4.12a is shown in 
Fig. 4.13. 

Fig. 4.12 (a) A joint formed by two bones, and the attached mesh. (b) Interpenetration of mesh 
surfaces at a joint. (c) Mesh transformation using a combination of two bone matrices 

Fig. 4.13 Multiple weights associated with vertices for combining bone matrices 

In general, if $n$ bones with indices 1, 2, ..., $n$ meet at a joint, the vertices 
surrounding the joint may be transformed using a matrix 

$$M = \sum_{i=1}^{n} w_i B'_i B_i^{-1}, \quad \text{where} \quad \sum_{i=1}^{n} w_i = 1. \tag{4.12}$$
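
A planar sketch of the weighted combination in Eqs. 4.11 and 4.12 is given below. Since the combination is linear, blending the matrices and then transforming the vertex gives the same point as transforming the vertex with each bone and blending the results, which is what the code does. The names Vec2, Influence, skinRigid and blend are assumptions for this example, and each bone is simplified to a rotation about a pivot point. 

```cpp
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// One bone's contribution: rotation by angleDeg about the joint at 'pivot', with a weight.
struct Influence { Vec2 pivot; float angleDeg; float weight; };

// Applies a single bone's skinning transform B'B^-1 to a bind-pose vertex:
// move the vertex into the bone's local space, rotate, and move back.
Vec2 skinRigid(Vec2 v, Vec2 pivot, float angleDeg)
{
    float a = angleDeg * 3.14159265f / 180.0f, c = std::cos(a), s = std::sin(a);
    float x = v.x - pivot.x, y = v.y - pivot.y;                  // B^-1 V
    return {c * x - s * y + pivot.x, s * x + c * y + pivot.y};   // B' (B^-1 V)
}

// Weighted sum of the per-bone results; the weights are assumed to add up to 1.
Vec2 blend(Vec2 v, const std::vector<Influence>& bones)
{
    Vec2 w = {0, 0};
    for (const Influence& b : bones) {
        Vec2 p = skinRigid(v, b.pivot, b.angleDeg);
        w.x += b.weight * p.x;
        w.y += b.weight * p.y;
    }
    return w;
}
```

With equal weights, a vertex at unit distance from the joint blended between a fixed bone and a bone rotated by 90° lands at (0.5, 0.5), which is closer to the joint than either individual result. With a 180° rotation the blended point coincides with the joint itself. 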


The method outlined above is called transformation blending, and it usually 
produces smooth mesh deformations near joints. However when the angle of 
rotation of a bone is very large compared to its parent, the averaging scheme in 
Eq. 4.11 can produce two types of undesirable artefacts shown in Fig. 4.14. The first 
one is called a collapsing elbow effect, which appears when the angle between the 
axes of two adjacent bones becomes small. In this situation, vertex points on the 
inner edge of the mesh that are located near the joint move towards the centre. 
The second type of artefact is called the candy-wrapper effect, where one of the 
bones is twisted by 180° about its axis. In this case, vertices with nearly equal 
weights get transformed to closely located points near the joint. 

Fig. 4.14 (a) Collapsing elbow effect. (b) Candy-wrapper effect 


4.7 Keyframe Animation 


The animation of an articulated character model is usually done by specifying a set 
of keyframes that contain the information about the required joint transformation 
parameters at certain discrete points in time. Keyframes are generally predefined by 
an artist or an animator who can clearly specify the motion an object is required to 
produce. The joint angles for a character model at various instances in an animation 
sequence can also be obtained from motion capture systems. Here, the actions 
performed by a human actor are captured through the placement of markers near 
each joint of the body, and their recorded positions used to compute joint angles. 

A keyframe is essentially a time stamp of important transformation parameters 
and, optionally, other attributes such as colour, transparency etc. that are needed to 
render one frame of an animation sequence. As an example, consider a keyframe 
animation of the model of a stick figure shown in Fig. 4.15a. This model has five 
joint chains, and a total of 14 joints (Fig. 4.15b). A single configuration or “pose” 
of the model is therefore given by 14 joint angles $\theta_0, \theta_1, \ldots, \theta_{13}$, and the position 
$(x_0, y_0, z_0)$ of the root joint. A joint rotation that moves a link forward (towards $+z$) 
is considered as positive. For example, the elbow joints are constrained to rotate 
the arm only forward, by assigning only positive values for $\theta_4$ and $\theta_5$. Similarly 
the knee joint angles ($\theta_{10}$, $\theta_{11}$) are always assigned a negative value. An alternative 
definition for these joint angles can be obtained by viewing them as rotations about 
the x-axis. In this case, the angles at shoulders and elbows will have negative values, 
and the angles at the knees will have positive values. 

Fig. 4.15 (a) Model of a stick figure and (b) the joint chains used for its animation. The 
hierarchical structure of links consists of five branches (chains), and 14 internal nodes. A leaf 
node is indicated by a blank square 

Fig. 4.16 The four primary keyframes used for generating a walk sequence of the stick figure. The 
graphs show the values of some of the joint angles, and a linear interpolation between the values is 
indicated by dotted lines 

For a simple walk sequence for the stick figure, four keyframes are defined as 
shown in Fig. 4.16. These are the primary postures from which the intermediate 
motion can be generated by linear interpolation. In our example, movements of the 
neck, shoulder and wrists are neglected, and hence values of only ten joint angles 
are specified for each keyframe. More complex and realistic movements such as 
running, jumping or performing somersaults can be produced by creating a larger 
number of keyframes using motion capture systems. 

Fig. 4.17 Commonly used interpolation methods in keyframe animation 

The “in-between” frames of an animation sequence are generated by interpolating 
keyframe values using either step, linear or spline functions. A step function uses 
the previous keyframe values for all subsequent frames until another keyframe is 
encountered (Fig. 4.17a). A linear interpolation produces points along line segments 
connecting two consecutive keyframes (Fig. 4.17b). If $k_1$ denotes a parameter in a 
keyframe at time $t_1$, and $k_2$ denotes the value of the same parameter in the next 
keyframe at time $t_2$, the value $k$ for an in-between frame at time $t$ is given by 

$$k = \left( \frac{t_2 - t}{t_2 - t_1} \right) k_1 + \left( \frac{t - t_1}{t_2 - t_1} \right) k_2 \tag{4.13}$$

or equivalently, 

$$k = (1 - \lambda) k_1 + \lambda k_2, \quad 0 \le \lambda \le 1, \tag{4.14}$$

where 

$$\lambda = \frac{t - t_1}{t_2 - t_1}. \tag{4.15}$$

For smoother motion interpolation, keyframe values are connected using piecewise 
cubic splines (Fig. 4.17c). Catmull-Rom splines are commonly used for this 
purpose, as they have properties of both $C^0$ and $C^1$ continuity between consecutive 
spline segments. Please refer to Chap. 7 (Sects. 7.2 and 7.5) for more information on 
Catmull-Rom and other types of splines that are useful for generating approximating 
curves and surfaces. 


4.8 Sample Implementation of Vertex Skinning 


In Sect. 3.5 we discussed the implementation of a scene graph class. For vertex 
skinning, we will use a highly simplified model where the information attached to 
each node is appended with a vertex index range given by the first and the last indices 
of the range. In this model, there is no need for an object node, and the vertices are 
processed at group nodes only. Listing 4.1 gives the class definition for a skeleton 
node. Documentation of the methods in this class can be found in Appendix C. 
Just like a scene graph node, a skeleton node also stores transformation parameters 
and a list of pointers to its child nodes. 


4.8.1 Skeleton Node 


The primary function of the SkeletonNode is to provide a convenient framework 
for representing the bone hierarchy and also to transform the vertex list 
of a mesh model using joint angles specified for each bone. The two functions 
preprocessPhase() and animationPhase() both initiate a recursive 
traversal of the tree to transform entries in the vertex lists as shown in Fig. 4.11. 

It is useful to have a Skeleton class with functions to load a skeleton definition 
and to define joint angles for bones during animation (Listing 4.2). These two 
functions provide the main interface between the classes and the user application. 
A Skeleton object represents the whole skeleton of a mesh model consisting of 
several bones (skeleton nodes). 

The contents of the skeleton definition file are organized as shown in Fig. 4.18. 
The loadSkeleton() function reads in the parameters and builds the hierarchical 
structure. The reference to the root node is available to the application via the 
function getRoot(). 


4.8.2 Skinned Mesh Node 


The SkinnedMesh class encapsulates data and related functions for loading 
a mesh file consisting of vertex and polygon lists, attaching a skeleton, and 
transforming the vertices using the joint angles associated with the bones of the 
skeleton (Listing 4.3). 

As shown in Fig. 4.11, the SkinnedMesh class uses three vertex lists in the 
form of vectors to store the initial vertices of the mesh in bind pose, the intermediate 
set of vertices after the pre-processing phase, and the final set of vertices after 
applying joint angle transformations. The mesh definition file has a simple structure 
consisting of the list of vertices and polygons. Polygons are specified using vertex 
indices (three indices for triangles and four for quads). The vertex index starts 
from 1. Figure 4.19 gives the mesh definition for a rectangular prism. 

The framework described above also uses the Point3 and Matrix classes for 
various vertex and transformation related functions (see Appendix A). This book's 
companion website contains the header and implementation files of all the above 
classes. 


Listing 4.1 Class definition for a single node of a skeleton 

#include "Matrix.h" 
#include "Point3.h" 
#include <list> 
#include <vector> 
using namespace std; 

class SkeletonNode 
{ 
private: 
    list<SkeletonNode*> _children; 
    int _firstIndex, _lastIndex, _parentIndex; 
    SkeletonNode* _parent; 
    float _tx, _ty, _tz, _angleX, _angleY, _angleZ; 
    Matrix *_matrix, *_invMatrix; 

public: 
    SkeletonNode(int parentIndx, float tx, float ty, 
                 float tz, int firstIndx, int lastIndx) 
        : _parent(NULL), 
          _tx(tx), _ty(ty), _tz(tz), 
          _angleX(0.0), _angleY(0.0), _angleZ(0.0), 
          _firstIndex(firstIndx), _lastIndex(lastIndx), 
          _parentIndex(parentIndx) 
    { 
        _matrix = new Matrix(); 
        _invMatrix = new Matrix(); 
        updateMatrices(); 
    } 
    ~SkeletonNode() {} 

    void addChild(SkeletonNode* node); 
    void removeChild(SkeletonNode* node); 
    void rotateX(float angle); 
    void rotateY(float angle); 
    void rotateZ(float angle); 
    void attachVertices(int firstIndex, int lastIndex); 
    void setParentIndex(int parentIndex); 
    int getParentIndex() const; 
    int getFirstIndex() const; 
    int getLastIndex() const; 
    void initialize(); 
    Matrix* getMatrix() const; 
    void updateMatrices(); 
    Matrix* getInverseMatrix() const; 
    vector<Point3*> preprocessPhase(vector<Point3*> vertices); 
    vector<Point3*> animationPhase(vector<Point3*> vertices); 
    void transform1(vector<Point3*> vertices, 
                    float tx, float ty, float tz); 
    void transform2(vector<Point3*> vertices, 
                    Matrix matrix); 
}; 


Listing 4.2 Class definition for a skeleton 

#include "SkeletonNode.h" 
#include <string> 
#include <vector> 
using namespace std; 

class Skeleton 
{ 
private: 
    SkeletonNode* _root; 
    vector<SkeletonNode*> _bones; 
    void attachBones(); 

public: 
    Skeleton() 
        : _root(new SkeletonNode()) 
    {} 
    ~Skeleton() { delete _root; } 
    SkeletonNode* getRoot() const; 
    void loadSkeleton(const string& filename); 
    void rotate(int i, float angleX, 
                float angleY, float angleZ); 
    void translateBase(float tx, float ty, float tz); 
}; 


Fig. 4.18 Sample skeleton definition file 

An example of a simple application using the skeleton animation framework 
is shown in Listing 4.4. At the initialization stage, both mesh and skeleton 
objects are created, corresponding data loaded from input files, the skeleton 
is attached to the mesh, and the preprocess() function is called on 
the mesh object. This function in turn passes the vertex data to the root 
node of the skeleton via the preprocessPhase() function and gets back the
intermediate vertices. The display() function performs the animation of the
mesh by defining joint angles for the bones. In the example, the function call
skeleton->rotate(3,30,0,-75) is used to rotate the bone with index 3
by 30° about the x-axis and −75° about the z-axis. The sequence of rotations is
Listing 4.3 Class definition for a mesh
#include <vector>
#include <string>
#include "Skeleton.h"
#include "Point3.h"
using namespace std;

struct Polygon
{
    int vert1, vert2, vert3, vert4;
};

class SkinnedMesh
{
public:
    enum PolyType {TRIANGLE, QUAD};

    SkinnedMesh(PolyType polytype)
        : _polytype(polytype),
          _skeleton(NULL)
    {}
    ~SkinnedMesh() {}
    void loadMesh(const string& filename);
    void render();
    void setColor(float colorR, float colorG, float colorB);
    void attachSkeleton(Skeleton* skeleton);
    Skeleton* getSkeleton() const;

private:
    vector<Point3*> _verticesV;
    vector<Point3*> _verticesW;
    vector<Point3*> _verticesVT;
    vector<Polygon*> _polygons;
    PolyType _polytype;
    float _colorR, _colorG, _colorB;
    float _xmin, _xmax, _ymin, _ymax, _zmin, _zmax;
    Skeleton* _skeleton;
    void normal(Point3* p1, Point3* p2, Point3* p3) const;
    void preprocess();
    void transform();
};



pre-defined. The function call mesh->render() is used inside the display loop
to render the mesh with the transformed vertex coordinates.


4.9 Summary 


This chapter addressed the problem of animating articulated character models. 
Character models are divided into two main categories: those constructed using 
individual component objects, and those modelled as a single mesh surface. 
The first category of objects requires blending of surfaces at the joints to avoid 
interpenetration of component objects and the appearance of gaps during animation. 
Fig. 4.19 A sample mesh definition file (annotations in the figure: number of vertices, number of polygons)


Listing 4.4 Example of an application using the vertex skinning algorithm

SkinnedMesh* mesh;
Skeleton* skeleton;

void initialise()
{
    mesh = new SkinnedMesh(SkinnedMesh::QUAD);
    mesh->loadMesh("HumanModel.txt");
    mesh->setColor(1.0, 0.5, 0.0);

    skeleton = new Skeleton();
    skeleton->loadSkeleton("Skeleton.txt");
    mesh->attachSkeleton(skeleton);

    glClearColor(1.0f, 1.0f, 1.0f, 1.0f);
    glClearDepth(1.0f);
}

void display()
{
    skeleton->rotate(3, 30, 0, -75);
    mesh->render();
}



It was shown that Hermite polynomials and cubic Bezier polynomials could be 
effectively used for vertex blending. 

This chapter also presented the vertex skinning algorithm which is a well-known 
method used in skeletal animation. Various aspects of vertex skinning including 
the transformation of mesh vertices using skeletons, application of scene graphs in
vertex skinning, and transformations using a combination of bone matrices have 
been discussed in detail. The process of keyframe interpolation has been outlined. 
This chapter also demonstrated the implementation of the vertex skinning algorithm. 

The next chapter introduces the quaternion algebra and transformations that are 
used for interpolating between orientations in three-dimensional space. Quaternions 
have a very important role in animation sequences where generic rotational trans- 
formations are applied to objects. 


4.10 Supplementary Material for Chap. 4 


The folder Chapter4/Code on the companion website contains code examples 
demonstrating the application of concepts introduced in this chapter. A brief 
description of these programs is given below. 


1. SkeletonNode.cpp 


class
SkeletonNode 


This class implements the basic functionalities of a scene graph for skeleton 
animation as detailed in Sect. 4.8. The class documentation can be found in 
Appendix C. 


2. SkinnedMesh.cpp 


Additional files: 


class 
SkinnedMesh 


This class supports several functions for loading and rendering a skinned mesh 
file. A brief description of the class can be found in Sect. 4.8.2, and the class 
documentation in Appendix C. 
3. VertexBlending.cpp


Additional files: 
None 


This program generates a blending surface between two cylinders using 
Hermite interpolation. Clicking the left mouse button starts the rotation of one 
of the cylinders. Use up or down arrow keys to increase or reduce the weight a 
of the tangent vectors. Press left or right arrow keys to change the view direction. 


4. TwoBoneTransform.cpp 


Additional files: 


None 


The program demonstrates the collapsing elbow and candy wrapper effects 
seen in transformations using a combination of two bone matrices. Use left 
and right arrow keys to increase or decrease the bending angle (rotation about 
the z-axis). Use up and down arrow keys to decrease or increase the twist 
angle (rotation about the x-axis). The spread of the weights can be increased 
by pressing the ‘s’ key, and decreased by pressing the ‘a’ key. 


5. HumanModel.cpp 


Additional classes: 
Mesh 
Skeleton 
SkeletonNode 
Point3 
Matrix 


This program uses the vertex skinning method to transform a mesh based 
on transformations applied to a skeleton. It requires two input files,
"HumanModel.txt" (mesh definition) and "Skeleton.txt" (skeleton definition). The bone
indices are defined as given in Fig. 4.5. The bone transformations are defined
inside the display() function of the program. Use left and right arrow keys to
change the view direction.


4.11 Bibliographical Notes 


The terms vertex blending and vertex skinning are often used synonymously in computer
graphics literature. In this book, vertex blending refers to an interpolation method
between polyhedral surfaces, while vertex skinning refers to a completely different
method of animating a mesh using a skeleton. The process of constructing blending
surfaces between polyhedral objects is often referred to as polyhedral vertex
blending. Such methods were originally introduced for Computer Aided Design
(CAD) applications. Bajaj and Ihm (1992) give the fundamental concepts for
designing blending surfaces with Hermite polynomials. A description of parametric
cubic curves and surfaces generated using Hermite polynomials can be found in 
Foley (1994, 1996), and Angel (2008). Cubic interpolation methods using Hermite 
curves are discussed in Eberly (2007) and Moller et al. (2008). 

Skeleton animation is an important technique in game programming and char- 
acter animation. Books such as Astle (2006), Moller et al. (2008) and Erleben 
(2005) provide a description of skeleton based mesh transformation methods. Eberly 
(2007) gives an outline of the vertex skinning method. The implementation aspects 
of vertex skinning are presented in Lander (1998) and Kavan (2003). 
Chapter 5
Quaternions 


Overview 


In computer graphics applications, quaternions are used to represent three- 
dimensional rotations. They provide some key advantages over the traditional 
way of defining generic rotational transformations using Euler angles. Quaternions 
are also extremely useful for interpolating between two orientations in three- 
dimensional space. Keyframe animations requiring orientation interpolation 
therefore find a very convenient mathematical tool in quaternions. 

This chapter gives an overview of the algebra of quaternions, the geometrical 
interpretation of quaternion transformations, and quaternion based linear and 
spherical interpolation functions. A comparison of rotation interpolation methods 
using Euler angles, angle-axis representations, and quaternions is presented. The 
extension of quaternions to eight-dimensional dual quaternions and their usefulness 
in representing general rigid-body transformations are also discussed. 


5.1 Review of Complex Numbers 


Quaternions are hyper-complex numbers of rank 4, and therefore it is useful to
review some of the basic concepts related to complex number algebra to gain
a better insight into quaternion operations. Even though a complex number z is
commonly represented in the form a + ib, where $i = \sqrt{-1}$ and a, b are respectively
the real and imaginary parts of z, we will use the two-tuple notation (a, b) for z.
With this notation, we can write 1 = (1, 0), and i = (0, 1). These two-dimensional
vectors (1, 0) and (0, 1) form an orthogonal basis for the complex space, where any
number z = (a, b) can be expressed as their linear combination a(1, 0) + b(0, 1).
The operations of addition, subtraction and multiplication in the field of complex
numbers are defined as follows:


\[ (a_1, b_1) \pm (a_2, b_2) = (a_1 \pm a_2,\; b_1 \pm b_2) \tag{5.1} \]


R. Mukundan, Advanced Methods in Computer Graphics: With examples in OpenGL, T 
DOI 10.1007/978-1-4471-2340-8. 5, © Springer-Verlag London Limited 2012 
Fig. 5.1 Multiplication by a unit complex number has the effect of rotation of vectors and points about the origin on a two-dimensional plane (the rotated point is labelled (x cos δ − y sin δ, x sin δ + y cos δ))


\[ (a_1, b_1)(a_2, b_2) = (a_1 a_2 - b_1 b_2,\; a_1 b_2 + a_2 b_1) \tag{5.2} \]

\[ c\,(a, b) = (ca,\; cb), \tag{5.3} \]


where c is a real number. The multiplication rule given in Eq. 5.2 establishes the
fact that $i^2$ = (0, 1)(0, 1) = (−1, 0). The complex conjugate of z = (a, b) is given by
z* = (a, −b). The magnitude of z is a non-negative real number defined as

\[ |z| = \sqrt{a^2 + b^2} \tag{5.4} \]

Using the multiplication rule, we find that

\[ z z^* = |z|^2 = a^2 + b^2 \tag{5.5} \]

If a complex number z has a unit magnitude, then zz* = 1. This implies that for a
unit complex number, z* is the multiplicative inverse of z. All unit complex numbers
can be expressed in the general form

\[ z = (\cos\delta,\; \sin\delta) \tag{5.6} \]


Consider any vector (or point) p = (x, y) in a two-dimensional coordinate system.
If we treat p as a complex number, and multiply it by the unit complex number z
given above, the product zp can be evaluated using Eq. 5.2 as follows:

\[ p' = (\cos\delta,\; \sin\delta)(x, y) = (x\cos\delta - y\sin\delta,\; x\sin\delta + y\cos\delta) \tag{5.7} \]

The transformed vector (or point) p' has the same magnitude as p, and can be
obtained by rotating p about the origin by an angle δ (Fig. 5.1). The unit complex
number therefore represents a rotation in two-dimensional space.
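The rotation in Eq. 5.7 can be tried out with the standard std::complex type. The following is a minimal sketch; the function name rotate2D is an illustrative choice and is not part of the book's companion code.

```cpp
#include <cmath>
#include <complex>

// Rotates the 2D point p = (x, y), stored as the complex number x + iy,
// about the origin by the angle delta (in radians), following Eq. 5.7.
// The name rotate2D is hypothetical.
std::complex<double> rotate2D(const std::complex<double>& p, double delta)
{
    // Unit complex number z = (cos delta, sin delta) from Eq. 5.6.
    const std::complex<double> z(std::cos(delta), std::sin(delta));
    return z * p;  // complex multiplication implements Eq. 5.2
}
```

Rotating the point (1, 0) by 90° with this function should give a point close to (0, 1), with the magnitude unchanged.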

The geometrical interpretation of unit complex numbers as rotation operators 
forms the basis for the framework for an extended set of hyper-complex numbers 
called quaternions. We will see shortly that unit quaternions represent three-
dimensional rotations. In the following section, we introduce the algebra of 
quaternion numbers. 


5.2 Quaternion Algebra


We have seen above that the field of complex numbers has 1 = (1, 0), i = (0, 1)
as the orthogonal basis. The quaternion set has an extended orthogonal basis
consisting of four elements 1 = (1, 0, 0, 0), i = (0, 1, 0, 0), j = (0, 0, 1, 0), k = (0,
0, 0, 1). Thus a quaternion $Q = (q_0, q_1, q_2, q_3)$ has an equivalent representation
$q_0 + q_1 i + q_2 j + q_3 k$, where the quaternion components $q_i$ are all real values. The
term $q_0$ is called the scalar part of Q, and the 3-tuple $(q_1, q_2, q_3)$ the vector part. The
operations of addition, subtraction and scalar multiplication are defined as follows:

\[ (p_0, p_1, p_2, p_3) \pm (q_0, q_1, q_2, q_3) = (p_0 \pm q_0,\; p_1 \pm q_1,\; p_2 \pm q_2,\; p_3 \pm q_3) \tag{5.8} \]

\[ c\,(q_0, q_1, q_2, q_3) = (c q_0,\; c q_1,\; c q_2,\; c q_3), \tag{5.9} \]

where c is any real number. Analogous to Eq. 5.2, the quaternion product is given by

\[
\begin{aligned}
(p_0, p_1, p_2, p_3)(q_0, q_1, q_2, q_3) = (\,& p_0 q_0 - p_1 q_1 - p_2 q_2 - p_3 q_3, \\
& p_0 q_1 + p_1 q_0 + p_2 q_3 - p_3 q_2, \\
& p_0 q_2 - p_1 q_3 + p_2 q_0 + p_3 q_1, \\
& p_0 q_3 + p_1 q_2 - p_2 q_1 + p_3 q_0 \,)
\end{aligned}
\tag{5.10}
\]


From the above definition of a quaternion product, it follows that quaternion
multiplication is not commutative. That is, for any two quaternions $P = (p_0, p_1, p_2, p_3)$
and $Q = (q_0, q_1, q_2, q_3)$, the product PQ need not necessarily be the same as QP.
If we denote the vector part of P by $v = (p_1, p_2, p_3)$ and the vector part of Q by
$w = (q_1, q_2, q_3)$, then Eq. 5.10 becomes

\[ (p_0, v)(q_0, w) = (p_0 q_0 - v \cdot w,\; p_0 w + q_0 v + v \times w) \tag{5.11} \]

where v·w denotes the dot product and v×w the cross product of the two vectors.
The right-hand side of Eq. 5.10, when treated as a column vector, can be conveniently
expressed as a product of a matrix of elements of P and a vector containing elements
of Q as given below.

\[
PQ =
\begin{bmatrix}
p_0 & -p_1 & -p_2 & -p_3 \\
p_1 & p_0 & -p_3 & p_2 \\
p_2 & p_3 & p_0 & -p_1 \\
p_3 & -p_2 & p_1 & p_0
\end{bmatrix}
\begin{bmatrix} q_0 \\ q_1 \\ q_2 \\ q_3 \end{bmatrix}
\tag{5.12}
\]
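or, equivalently as

\[
PQ =
\begin{bmatrix}
q_0 & -q_1 & -q_2 & -q_3 \\
q_1 & q_0 & q_3 & -q_2 \\
q_2 & -q_3 & q_0 & q_1 \\
q_3 & q_2 & -q_1 & q_0
\end{bmatrix}
\begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{bmatrix}
\tag{5.13}
\]

From Eq. 5.10, we can derive the following properties satisfied by the quaternion
basis:

\[
\begin{aligned}
ij &= -ji = k \\
jk &= -kj = i \\
ki &= -ik = j
\end{aligned}
\tag{5.14}
\]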
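The product rule in Eq. 5.10 and the basis properties in Eq. 5.14 can be checked with a few lines of code. The sketch below is illustrative only; the Quat structure and the function names are assumptions, not the classes used in the book's companion code.

```cpp
// A quaternion (q0, q1, q2, q3) with scalar part q0 (hypothetical type).
struct Quat
{
    double q0, q1, q2, q3;
};

// Quaternion product PQ, written term by term from Eq. 5.10.
Quat multiply(const Quat& p, const Quat& q)
{
    return Quat{
        p.q0 * q.q0 - p.q1 * q.q1 - p.q2 * q.q2 - p.q3 * q.q3,
        p.q0 * q.q1 + p.q1 * q.q0 + p.q2 * q.q3 - p.q3 * q.q2,
        p.q0 * q.q2 - p.q1 * q.q3 + p.q2 * q.q0 + p.q3 * q.q1,
        p.q0 * q.q3 + p.q1 * q.q2 - p.q2 * q.q1 + p.q3 * q.q0};
}

// Exact comparison is adequate for the basis elements, whose products
// involve only the values 0, 1 and -1.
bool equal(const Quat& a, const Quat& b)
{
    return a.q0 == b.q0 && a.q1 == b.q1 && a.q2 == b.q2 && a.q3 == b.q3;
}
```

With i = (0, 1, 0, 0) and j = (0, 0, 1, 0), multiply(i, j) gives k = (0, 0, 0, 1) while multiply(j, i) gives −k, which illustrates that the product is not commutative.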


Quaternions also form a commutative group under addition, where (0,0,0,0) is 
the identity element. Quaternion multiplication is associative, and distributes over 
addition. If P, Q, R are any three quaternions, 


\[
\begin{aligned}
(PQ)R &= P(QR) \\
(P + Q)R &= PR + QR \\
P(Q + R) &= PQ + PR
\end{aligned}
\tag{5.15}
\]


The conjugate Q* of the quaternion $Q = (q_0, q_1, q_2, q_3)$ is defined as

\[ Q^* = (q_0, -q_1, -q_2, -q_3) \tag{5.16} \]

Thus, if $Q = (q_0, w)$, then $Q^* = (q_0, -w)$. Also, $Q + Q^* = 2q_0$. The magnitude
(also called the length, or norm) of Q, denoted by |Q|, is

\[ |Q| = \sqrt{q_0^2 + q_1^2 + q_2^2 + q_3^2} \tag{5.17} \]

By taking the magnitude of the quaternion product in Eq. 5.10 we get

\[ |PQ| = |P|\,|Q| \tag{5.18} \]

Using Eq. 5.11, it is easy to find that

\[ Q Q^* = Q^* Q = |Q|^2. \tag{5.19} \]
By dividing the above equation by $|Q|^2$, we get the equation for the quaternion
inverse. If we denote the quaternion inverse of Q by $Q^{-1}$, then

\[ Q^{-1} = \frac{Q^*}{|Q|^2} \tag{5.20} \]

A quaternion Q can be normalized to a unit quaternion by dividing each of its
components by the length |Q| given in Eq. 5.17. A unit quaternion satisfies the
following equations:

\[
\begin{aligned}
|Q| &= 1 \\
q_0^2 + q_1^2 + q_2^2 + q_3^2 &= 1 \\
Q^{-1} &= Q^*
\end{aligned}
\tag{5.21}
\]


If the real part $q_0$ of a quaternion is zero, it represents a vector $(q_1, q_2, q_3)$ in
three-dimensional space. Such a quaternion that has the form $(0, q_1, q_2, q_3) = (0, q)$
is called a pure quaternion. Similarly, quaternions of the type (a, 0, 0, 0) with the
vector component zero are called real quaternions. The algebra of real quaternions
is the same as that of real numbers. Similarly, quaternions of the type (a, b, 0, 0)
behave exactly like complex numbers (a, b).
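The conjugate, magnitude and inverse defined in Eqs. 5.16, 5.17 and 5.20 translate directly into code. The following sketch uses a hypothetical Quat structure and function names chosen for illustration; it assumes that inverse() is never called with the zero quaternion.

```cpp
#include <cmath>

// A quaternion (q0, q1, q2, q3) with scalar part q0 (hypothetical type).
struct Quat
{
    double q0, q1, q2, q3;
};

// Quaternion product PQ (Eq. 5.10).
Quat multiply(const Quat& p, const Quat& q)
{
    return Quat{
        p.q0 * q.q0 - p.q1 * q.q1 - p.q2 * q.q2 - p.q3 * q.q3,
        p.q0 * q.q1 + p.q1 * q.q0 + p.q2 * q.q3 - p.q3 * q.q2,
        p.q0 * q.q2 - p.q1 * q.q3 + p.q2 * q.q0 + p.q3 * q.q1,
        p.q0 * q.q3 + p.q1 * q.q2 - p.q2 * q.q1 + p.q3 * q.q0};
}

// Conjugate Q* (Eq. 5.16).
Quat conjugate(const Quat& q)
{
    return Quat{q.q0, -q.q1, -q.q2, -q.q3};
}

// Magnitude |Q| (Eq. 5.17).
double magnitude(const Quat& q)
{
    return std::sqrt(q.q0 * q.q0 + q.q1 * q.q1 + q.q2 * q.q2 + q.q3 * q.q3);
}

// Inverse Q*/|Q|^2 (Eq. 5.20). Assumption: q is not the zero quaternion.
Quat inverse(const Quat& q)
{
    const double n2 = q.q0 * q.q0 + q.q1 * q.q1 + q.q2 * q.q2 + q.q3 * q.q3;
    return Quat{q.q0 / n2, -q.q1 / n2, -q.q2 / n2, -q.q3 / n2};
}
```

For any non-zero Q, multiply(Q, inverse(Q)) should be close to the real quaternion (1, 0, 0, 0), and the magnitude of a product should equal the product of the magnitudes (Eq. 5.18) up to rounding error.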


5.3 Quaternion Transformation 


A special type of quaternion product in the form QPQ* plays an important role in
three-dimensional transformations. We have just seen that a vector p in the
three-dimensional space corresponds to a pure quaternion P = (0, p). An interesting fact
that leads to the notion of a quaternion transformation is that given any quaternion
Q and a pure quaternion P, the product P' = QPQ* is also a pure quaternion. Thus
QPQ* can be viewed as the transformation of a pure quaternion $P = (0, p_1, p_2, p_3)$
using another quaternion Q. We can derive the matrix form of this transformation by
using Eq. 5.13 for obtaining the matrix expression for PQ* and then using Eq. 5.12
for getting the final product Q(PQ*).

\[
QPQ^* =
\begin{bmatrix}
q_0 & -q_1 & -q_2 & -q_3 \\
q_1 & q_0 & -q_3 & q_2 \\
q_2 & q_3 & q_0 & -q_1 \\
q_3 & -q_2 & q_1 & q_0
\end{bmatrix}
\begin{bmatrix}
q_0 & q_1 & q_2 & q_3 \\
-q_1 & q_0 & -q_3 & q_2 \\
-q_2 & q_3 & q_0 & -q_1 \\
-q_3 & -q_2 & q_1 & q_0
\end{bmatrix}
\begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{bmatrix}
\tag{5.22}
\]
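The following matrix equation immediately follows by multiplying the two
matrices together, and setting $p_0 = 0$:

\[
\begin{bmatrix} 0 \\ p_1' \\ p_2' \\ p_3' \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(-q_0 q_3 + q_1 q_2) & 2(q_0 q_2 + q_1 q_3) \\
0 & 2(q_0 q_3 + q_1 q_2) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(-q_0 q_1 + q_2 q_3) \\
0 & 2(-q_0 q_2 + q_1 q_3) & 2(q_0 q_1 + q_2 q_3) & q_0^2 - q_1^2 - q_2^2 + q_3^2
\end{bmatrix}
\begin{bmatrix} 0 \\ p_1 \\ p_2 \\ p_3 \end{bmatrix}
\tag{5.23}
\]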


This equation defines the quaternion transformation of a three-dimensional point
(or vector) $p = (p_1, p_2, p_3)$ to another three-dimensional point (or vector)
$p' = (p_1', p_2', p_3')$. An alternative form of the equation can be derived as follows:

\[ QPQ^* = (q_0, w)(0, p)(q_0, -w), \tag{5.24} \]

where $w = (q_1, q_2, q_3)$. Using Eq. 5.11 to expand the product term, we get

\[ QPQ^* = \left(0,\; q_0^2\, p + w(p \cdot w) + 2q_0 (w \times p) + w \times (w \times p)\right) \tag{5.25} \]

The above equation proves that the transformation of P is also a pure quaternion.
We can therefore write

\[ p' = q_0^2\, p + w(p \cdot w) + 2q_0 (w \times p) + w \times (w \times p) \tag{5.26} \]

Further simplification of the right-hand side using the vector identity
w × (w × p) = w(p·w) − (w·w)p gives

\[ p' = (q_0^2 - w^2)\, p + 2w(p \cdot w) + 2q_0 (w \times p) \tag{5.27} \]

where $w^2 = |w|^2 = q_1^2 + q_2^2 + q_3^2$.


It should be noted that QPQ* generally is not a scale-preserving transformation
because

\[ |P'| = |Q|^2\,|P| \tag{5.28} \]

If we impose the constraint that Q is a unit quaternion (i.e., |Q| = 1), we get a
scale-invariant (or length-preserving) transform. With this additional criterion, we
can also write the inverse quaternion transform in a concise form as

\[ P = Q^* P' Q \tag{5.29} \]

We also note that when P is the zero-quaternion (0, 0, 0, 0), so is P'. Therefore
the origin is a fixed point of the transformation. A length-preserving transformation
with a fixed point is a rotation. In the following sections we will attempt to find a
geometric interpretation of the quaternion transformation as a pure rotation in
three-dimensional space, and express the components of a unit quaternion in terms of the
angle and the axis of rotation.


5.4 Generalized Rotations 


Before we further analyze the transform properties of quaternions, it would be 
worthwhile to review some of the key concepts relating to general three-dimensional 
rotations. 

Any composite transformation that preserves length, angle and area is called a 
rigid-body transformation. If a rigid body transformation has also a fixed point 
(pivot), then it is a rotation. A rotation can be measured in terms of the angular 
deviation of an orthogonal right-handed system fixed on the rotating body, with the 
origin of the system at the fixed point of rotation. In Fig. 5.2a, Ox, Oy, Oz are the
axes of an orthogonal triad before rotation, and $Ox_1$, $Oy_1$, $Oz_1$ denote the transformed
axes directions after a rotation about O. The coordinate reference frame is inertially
fixed and is represented by X, Y, Z axes.

A general rigid body transformation of an object without a fixed point can
be treated as a rotation followed by a translation. Such a transformation can be
equivalently performed by first carrying out a rotation that aligns the axes parallel to
the final directions, followed by a translation that moves the fixed point O to its final
position $O_1$ (Fig. 5.2b). While any translation can be unambiguously represented by
a three component vector, a general rotation may be specified in several ways. In 
the following, we consider the Euler angle and angle-axis representations of three- 
dimensional rotations. 


Fig. 5.2 (a) A generalized rotation with a fixed point O that transforms the directions of body-fixed axes from O(x, y, z) to O($x_1$, $y_1$, $z_1$). (b) A general transformation without a fixed point
5.4.1 Euler Angles


Euler's theorem on rotations states that any general rotation can be performed
using a sequence of elementary rotations about the coordinate axes passing through
the fixed point. The theorem further states that if no two successive rotations are
about the same axis, then the maximum number of rotations needed to achieve the
transformation is three. Thus any rotational transformation can be represented by
a sequence of three rotations about mutually independent axes. These angles are
called Euler angles. Before defining an Euler angle representation, we need to fix
the sequence in which the rotations are performed. If we denote rotations about the
X-axis by ψ, rotations about Y by φ, and rotations about Z by θ, a set of Euler angles
can be defined using any of the following 12 sequences:

\[
\begin{array}{lll}
\{\psi\ \phi\ \theta\} & \{\phi\ \theta\ \psi\} & \{\theta\ \psi\ \phi\} \\
\{\psi\ \theta\ \phi\} & \{\phi\ \psi\ \theta\} & \{\theta\ \phi\ \psi\} \\
\{\psi_1\ \phi\ \psi_2\} & \{\phi_1\ \theta\ \phi_2\} & \{\theta_1\ \psi\ \theta_2\} \\
\{\psi_1\ \theta\ \psi_2\} & \{\phi_1\ \psi\ \phi_2\} & \{\theta_1\ \phi\ \theta_2\}
\end{array}
\]


The Euler angle sequence {ψ φ θ} represents a rotation about X followed by a
second rotation about Y, followed by a third rotation about the Z axis. The sequence
{φ1 ψ φ2} gives another Euler angle representation in terms of a rotation about the Y
axis, followed by a second rotation about the X axis, and then a third rotation again
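about the Y axis. The six sequences where each axis is used exactly once are commonly
called Tait-Bryan angles, while the six sequences in which the first and the third
rotations are about the same axis are known as proper (or classic) Euler angles.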

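The transformation matrix for the {ψ φ θ} sequence is obtained by concatenating
the transformation matrices as shown below.

\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 & 0 \\
\sin\theta & \cos\theta & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\cos\phi & 0 & \sin\phi & 0 \\
0 & 1 & 0 & 0 \\
-\sin\phi & 0 & \cos\phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\psi & -\sin\psi & 0 \\
0 & \sin\psi & \cos\psi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]

\[
=
\begin{bmatrix}
\cos\phi\cos\theta & \sin\psi\sin\phi\cos\theta - \cos\psi\sin\theta & \cos\psi\sin\phi\cos\theta + \sin\psi\sin\theta & 0 \\
\cos\phi\sin\theta & \sin\psi\sin\phi\sin\theta + \cos\psi\cos\theta & \cos\psi\sin\phi\sin\theta - \sin\psi\cos\theta & 0 \\
-\sin\phi & \sin\psi\cos\phi & \cos\psi\cos\phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\tag{5.30}
\]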


The above equation can be interpreted as the transformation of any point (x, y, z)
to (x', y', z') in a fixed coordinate frame. This interpretation does not use any
information pertaining to body-fixed axes. On the other hand, if we assume that
x, y, z represent the body-fixed axes which initially coincide with the coordinate
reference axes X, Y, Z, respectively, Eq. 5.30 can be viewed as the transformation
of a point from the moving body frame to the fixed coordinate reference frame. The
Euler angle representation described above (and shown in Fig. 5.3) used rotations
that are performed about the fixed principal axes directions X, Y, Z of the reference
frame. Such a transformation is called an extrinsic composition of rotations.

Fig. 5.3 An extrinsic composition of Euler angle rotations performed using the sequence {ψ, φ, θ}

Fig. 5.4 An intrinsic composition of Euler angle rotations performed using the sequence {ψ, φ, θ}

An intrinsic composition, on the other hand, uses rotations about body-fixed axes
whose directions change in the reference frame after every rotation. For example,
an aircraft orientation is defined in this manner. In Fig. 5.4, the yaw rotation ψ is
performed about the x-axis, the roll rotation φ about the transformed body y-axis,
and the pitch rotation θ about the transformed body z-axis. For this sequence
of intrinsic composition of rotations, the transformation from body frame to the
coordinate reference frame is given by


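\[
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\psi & -\sin\psi & 0 \\
0 & \sin\psi & \cos\psi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\cos\phi & 0 & \sin\phi & 0 \\
0 & 1 & 0 & 0 \\
-\sin\phi & 0 & \cos\phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 & 0 \\
\sin\theta & \cos\theta & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\tag{5.31}
\]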


Fig. 5.5 Two different Euler angle interpolation sequences generated for the same initial and target orientations

Fig. 5.6 Transformation of a vector under a general rotation about the origin in three-dimensional space (the axis of rotation is labelled u = (l, m, n))


A three-dimensional orientation can be represented in different ways using
different Euler angle sequences. Even if we keep the sequence fixed, certain
orientations can have more than one set of Euler angles. For instance, using the same
sequence {ψ φ θ}, both {−45°, −80°, 0°} and {135°, −100°, −180°} represent the same
transformation. This can be verified by evaluating the product matrix in Eq. 5.30 for
the two sets of angles. The non-uniqueness of the Euler angle representation also
means that you may not get a unique interpolation path between two orientations
(Fig. 5.5).
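The verification suggested above can be carried out numerically. The following sketch evaluates the upper-left 3 × 3 block of the product matrix in Eq. 5.30 for angles given in degrees; the type Mat3 and the function names are illustrative assumptions.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Upper-left 3x3 block of the product matrix in Eq. 5.30 for the
// sequence {psi, phi, theta} (rotations about X, Y, Z), angles in degrees.
Mat3 eulerMatrix(double psiDeg, double phiDeg, double thetaDeg)
{
    const double d2r = std::acos(-1.0) / 180.0;
    const double cps = std::cos(psiDeg * d2r),   sps = std::sin(psiDeg * d2r);
    const double cph = std::cos(phiDeg * d2r),   sph = std::sin(phiDeg * d2r);
    const double cth = std::cos(thetaDeg * d2r), sth = std::sin(thetaDeg * d2r);
    return Mat3{{
        {{cph * cth, sps * sph * cth - cps * sth, cps * sph * cth + sps * sth}},
        {{cph * sth, sps * sph * sth + cps * cth, cps * sph * sth - sps * cth}},
        {{-sph, sps * cph, cps * cph}}}};
}

// Largest absolute difference between corresponding elements.
double maxDifference(const Mat3& a, const Mat3& b)
{
    double d = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            d = std::max(d, std::fabs(a[i][j] - b[i][j]));
    return d;
}
```

Calling maxDifference(eulerMatrix(-45, -80, 0), eulerMatrix(135, -100, -180)) should return a value at the level of rounding error, confirming that the two angle sets describe the same orientation.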


5.4.2 Angle-Axis Transformation 


Euler's theorem concerning three-dimensional rotations states that any number
of rotational transformations with a single fixed point applied to an object can be
replaced by a single rotation of the object about an axis passing through the fixed
point. The axis is often called the equivalent axis of rotation. Any orientation of an
object with the origin as a fixed point can therefore be specified using an angle of
rotation δ and an axis of rotation given by a unit vector u = (l, m, n). In the following
discussion, we assume that the axis of rotation passes through the origin. Figure 5.6
depicts the rotational transformation applied to a vector p (or a point P).

If we denote the projected lengths of the vector p along directions of u (axis of
rotation) and s (perpendicular to axis of rotation) by a and r respectively, we can
write p = au + rs, where a = p·u. During any rotation of the vector p about the axis
u, both these projected distances a and r remain constant. If t = u × s denotes the unit
vector orthogonal to both u and s, then rs = p − au and rt = u × p, and the transformed
vector p' can be written as

\[
\begin{aligned}
p' &= a u + (r\cos\delta)\, s + (r\sin\delta)\, t \\
   &= a u + (p - a u)\cos\delta + (u \times p)\sin\delta \\
   &= p\cos\delta + (1 - \cos\delta)(p \cdot u)\, u + (u \times p)\sin\delta
\end{aligned}
\tag{5.32}
\]

The above equation is the well-known Rodrigues' rotation formula. The matrix
version of the Rodrigues' formula can be derived by defining a 3 × 3
skew-symmetric matrix $U_\times$ as

\[
U_\times =
\begin{bmatrix}
0 & -n & m \\
n & 0 & -l \\
-m & l & 0
\end{bmatrix},
\tag{5.33}
\]


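and replacing u, p, p' by the corresponding column vectors:

\[
U = \begin{bmatrix} l \\ m \\ n \end{bmatrix}, \qquad
p = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad
p' = \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}.
\tag{5.34}
\]

With the above notations, the vector cross-product u×p has an equivalent matrix
representation $(U_\times)\,p$. It can also be easily verified that the term (p·u)u in Eq. 5.32
is equivalent to the matrix product $(UU^T)\,p$. Thus we get

\[ p' = \left(I\cos\delta + (1 - \cos\delta)\,UU^T + U_\times \sin\delta\right) p \tag{5.35} \]

Noting that, for a unit vector u,

\[ U_\times^2 = UU^T - I \tag{5.36} \]

Equation 5.35 can be written in an alternate form as below.

\[ p' = \left(I + (1 - \cos\delta)\,U_\times^2 + U_\times \sin\delta\right) p \tag{5.37} \]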


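Equation 5.35 can also be written in the expanded matrix form as follows
for defining the rotational transformation of a point P expressed in homogeneous
coordinates:

\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
l^2(1 - \cos\delta) + \cos\delta & lm(1 - \cos\delta) - n\sin\delta & nl(1 - \cos\delta) + m\sin\delta & 0 \\
lm(1 - \cos\delta) + n\sin\delta & m^2(1 - \cos\delta) + \cos\delta & mn(1 - \cos\delta) - l\sin\delta & 0 \\
nl(1 - \cos\delta) - m\sin\delta & mn(1 - \cos\delta) + l\sin\delta & n^2(1 - \cos\delta) + \cos\delta & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\tag{5.38}
\]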
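The vector form of the Rodrigues' formula (Eq. 5.32) is straightforward to implement. The sketch below assumes that u is already a unit vector and that the angle is given in radians; the type Vec3 and the function name rodrigues are illustrative.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Rotates p about the unit axis u by the angle delta (radians), Eq. 5.32:
// p' = p cos(delta) + (1 - cos(delta)) (p.u) u + (u x p) sin(delta)
// Assumption: u has unit length.
Vec3 rodrigues(const Vec3& p, const Vec3& u, double delta)
{
    const double c = std::cos(delta);
    const double s = std::sin(delta);
    const double pu = p[0] * u[0] + p[1] * u[1] + p[2] * u[2];
    const Vec3 uxp{{u[1] * p[2] - u[2] * p[1],
                    u[2] * p[0] - u[0] * p[2],
                    u[0] * p[1] - u[1] * p[0]}};
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        r[i] = p[i] * c + (1.0 - c) * pu * u[i] + uxp[i] * s;
    return r;
}
```

For example, a rotation of 120° about the axis (1, 1, 1)/√3 permutes the coordinate axes cyclically, so the vector (1, 0, 0) is mapped to approximately (0, 1, 0).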
Let us consider the problem of computing the equivalent angle and axis of
rotation from a transformation matrix. Given a general 4 × 4 rotation matrix in the
form

\[
\begin{bmatrix}
m_{00} & m_{01} & m_{02} & 0 \\
m_{10} & m_{11} & m_{12} & 0 \\
m_{20} & m_{21} & m_{22} & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{5.39}
\]

we get the following equations using the matrix elements from Eq. 5.38:

\[
\begin{aligned}
m_{00} + m_{11} + m_{22} &= 1 + 2\cos\delta \\
m_{21} - m_{12} &= 2l\sin\delta \\
m_{02} - m_{20} &= 2m\sin\delta \\
m_{10} - m_{01} &= 2n\sin\delta
\end{aligned}
\tag{5.40}
\]


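From the above equations, we can derive the expressions for angle and axis of
rotation as follows:

\[
\begin{aligned}
\delta &= \tan^{-1}\left(\frac{\sqrt{(m_{21} - m_{12})^2 + (m_{02} - m_{20})^2 + (m_{10} - m_{01})^2}}{m_{00} + m_{11} + m_{22} - 1}\right) \\
l &= \frac{m_{21} - m_{12}}{2\sin\delta} \\
m &= \frac{m_{02} - m_{20}}{2\sin\delta} \\
n &= \frac{m_{10} - m_{01}}{2\sin\delta}
\end{aligned}
\tag{5.41}
\]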
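The extraction in Eq. 5.41 can be sketched as follows. Two choices here are assumptions rather than part of the text: the inverse tangent is evaluated with std::atan2 so that angles between 90° and 180° are resolved correctly, and the code requires sin δ ≠ 0 (the cases δ = 0 and δ = 180° need separate handling). The names Mat3, AngleAxis and angleAxisFromMatrix are illustrative.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Angle (radians) and unit axis (l, m, n) of a rotation.
struct AngleAxis
{
    double delta;
    double l, m, n;
};

// Recovers the angle and axis from the upper-left 3x3 block of a rotation
// matrix using Eqs. 5.40 and 5.41. Assumption: sin(delta) is not zero.
AngleAxis angleAxisFromMatrix(const Mat3& M)
{
    const double a = M[2][1] - M[1][2];  // 2 l sin(delta)
    const double b = M[0][2] - M[2][0];  // 2 m sin(delta)
    const double c = M[1][0] - M[0][1];  // 2 n sin(delta)
    const double twoSin = std::sqrt(a * a + b * b + c * c);
    const double twoCos = M[0][0] + M[1][1] + M[2][2] - 1.0;
    const double delta = std::atan2(twoSin, twoCos);
    return AngleAxis{delta, a / twoSin, b / twoSin, c / twoSin};
}
```

Because the square root is taken as positive, the returned angle lies between 0° and 180°, and the axis direction is chosen accordingly.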


In the next section, we will establish the equivalence between an angle-axis 
transformation and a unit quaternion transformation of the form QPQ* where P 
is a pure quaternion (0, p). 


5.5 Quaternion Rotations 


We will now try to represent the rotational transformation in Fig. 5.6 by a unit
quaternion $Q = (q_0, w)$, where the vector component w of the quaternion is along
the axis of rotation. Therefore we have

\[ w = k u, \quad \text{for some constant } k. \tag{5.42} \]

We saw earlier that a vector p can be transformed into another vector p' using
a unit quaternion Q and the result of this transformation is given by Eq. 5.27.
In the previous section, we considered an angle-axis transformation of a vector p
given by Eq. 5.32. We find a striking similarity between the two equations, which
suggests that the quaternion transformation in Eq. 5.27 is indeed an angle-axis
transformation. Equating the corresponding terms in both the equations, and noting
that $w^2 = k^2$ because u is a unit vector, we find that

\[
\begin{aligned}
q_0^2 - w^2 &= \cos\delta \\
2 q_0 k &= \sin\delta \\
2k^2 &= 1 - \cos\delta
\end{aligned}
\tag{5.43}
\]

From the above equations, we can see that k = sin(δ/2), and $q_0$ = cos(δ/2).
Therefore the unit quaternion that represents the rotation in Fig. 5.6 is given by

\[ Q = \left(\cos\frac{\delta}{2},\; l\sin\frac{\delta}{2},\; m\sin\frac{\delta}{2},\; n\sin\frac{\delta}{2}\right) \tag{5.44} \]


This result is fundamental to the theory of generalized rotations, as it provides
a direct mechanism for converting angle-axis representations of three-dimensional
rotations into unit quaternions. From this equation, we can also derive the
relationship between the components of any unit quaternion $Q = (q_0, q_1, q_2, q_3)$ and the
parameters of rotation it represents. The angle of rotation is given by

\[ \delta = 2\tan^{-1}\left(\frac{\sqrt{q_1^2 + q_2^2 + q_3^2}}{q_0}\right) \tag{5.45} \]

and the unit vector along the axis of rotation (l, m, n) can be obtained as

\[
l = \frac{q_1}{\sqrt{q_1^2 + q_2^2 + q_3^2}}, \qquad
m = \frac{q_2}{\sqrt{q_1^2 + q_2^2 + q_3^2}}, \qquad
n = \frac{q_3}{\sqrt{q_1^2 + q_2^2 + q_3^2}}
\tag{5.46}
\]

Replacing δ/2 with δ in Eq. 5.44, we can summarize our discussion above as
follows:

Any unit quaternion Q can be expressed in the form Q = (cos δ, u sin δ), and it
represents a rotation by an angle 2δ about a unit vector u passing through the origin.
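The conversions in Eqs. 5.44 to 5.46 can be sketched as a pair of functions. The structures and names below are illustrative assumptions; std::atan2 is used in place of the plain inverse tangent so that $q_0 = 0$ does not cause a division by zero, and the reverse conversion assumes that the vector part of Q is non-zero.

```cpp
#include <cmath>

// A quaternion (q0, q1, q2, q3) with scalar part q0 (hypothetical type).
struct Quat
{
    double q0, q1, q2, q3;
};

// Angle (radians) and unit axis (l, m, n) of a rotation.
struct AngleAxis
{
    double delta;
    double l, m, n;
};

// Unit quaternion for a rotation by delta about the unit axis (Eq. 5.44).
Quat fromAngleAxis(const AngleAxis& r)
{
    const double s = std::sin(0.5 * r.delta);
    return Quat{std::cos(0.5 * r.delta), r.l * s, r.m * s, r.n * s};
}

// Angle and axis of a unit quaternion (Eqs. 5.45 and 5.46).
// Assumption: the vector part (q1, q2, q3) is not the zero vector.
AngleAxis toAngleAxis(const Quat& q)
{
    const double v = std::sqrt(q.q1 * q.q1 + q.q2 * q.q2 + q.q3 * q.q3);
    return AngleAxis{2.0 * std::atan2(v, q.q0), q.q1 / v, q.q2 / v, q.q3 / v};
}
```

A rotation of 120° about the axis (1, 1, 1)/√3 gives the quaternion (0.5, 0.5, 0.5, 0.5), and converting that quaternion back should reproduce the same angle and axis.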
5.5.1 Quaternion Transformation Matrix


From the above discussion, we can conclude that if Q is a unit quaternion,
then Eq. 5.27 gives a rotational transformation of a vector p = (x, y, z, 0). This
transformation equation could also be written in the conventional matrix form as
shown below:

\[
\begin{bmatrix} x' \\ y' \\ z' \\ 0 \end{bmatrix}
=
\begin{bmatrix}
1 - 2q_2^2 - 2q_3^2 & 2q_1 q_2 - 2q_0 q_3 & 2q_1 q_3 + 2q_0 q_2 & 0 \\
2q_1 q_2 + 2q_0 q_3 & 1 - 2q_1^2 - 2q_3^2 & 2q_2 q_3 - 2q_0 q_1 & 0 \\
2q_1 q_3 - 2q_0 q_2 & 2q_2 q_3 + 2q_0 q_1 & 1 - 2q_1^2 - 2q_2^2 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 0 \end{bmatrix}
\tag{5.47}
\]


The same transformation matrix can be applied to transform a point P = (x, y,
z, 1) to another point P' = (x', y', z', 1) using the quaternion Q. The quaternion
transformation matrix in Eq. 5.47 is orthogonal, meaning that its inverse is the same
as its transpose. The matrix also has some very useful properties. If we equate this
matrix to a general 4 × 4 matrix given in Eq. 5.39, we can find that the following
relationships hold among the matrix elements:

\[
\begin{aligned}
m_{00} + m_{11} + m_{22} + 1 &= 4q_0^2 \\
m_{21} - m_{12} &= 4q_0 q_1 \\
m_{02} - m_{20} &= 4q_0 q_2 \\
m_{10} - m_{01} &= 4q_0 q_3
\end{aligned}
\tag{5.48}
\]

The above equations are useful for extracting the quaternion elements from a
given 4 × 4 rotational transformation matrix:

\[
\begin{aligned}
q_0 &= \frac{\sqrt{1 + m_{00} + m_{11} + m_{22}}}{2} \\
q_1 &= \frac{m_{21} - m_{12}}{4q_0} \\
q_2 &= \frac{m_{02} - m_{20}}{4q_0} \\
q_3 &= \frac{m_{10} - m_{01}}{4q_0}
\end{aligned}
\tag{5.49}
\]


We will choose only the positive value of the square-root for computing $q_0$.
A negative value for $q_0$ will change the sign of all remaining components and yield
the quaternion −Q in place of Q. Shortly (Eq. 5.57) we will see that both Q and
−Q represent the same rotation, and therefore we can safely impose the constraint
that the sign of $q_0$ is positive, and compute the remaining components from it. Note
also that the above equations are valid only when $q_0 \neq 0$. If $q_0 = 0$, then the angle
of rotation δ = ±180°, and the matrix in Eq. 5.47 becomes a symmetric matrix. For
this special case, the remaining quaternion elements can be derived as follows:

\[
\begin{aligned}
q_1 &= \operatorname{sign}(m_{21} - m_{12})\, \frac{\sqrt{1 + m_{00} - m_{11} - m_{22}}}{2} \\
q_2 &= \operatorname{sign}(m_{02} - m_{20})\, \frac{\sqrt{1 - m_{00} + m_{11} - m_{22}}}{2} \\
q_3 &= \operatorname{sign}(m_{10} - m_{01})\, \frac{\sqrt{1 - m_{00} - m_{11} + m_{22}}}{2}
\end{aligned}
\tag{5.50}
\]
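The conversions between a unit quaternion and its rotation matrix (Eqs. 5.47 and 5.49) can be sketched as below. Only the upper-left 3 × 3 block is stored, the names are illustrative, and fromMatrix() covers only the general case where $q_0$ is not close to zero; the special case of Eq. 5.50 is not handled.

```cpp
#include <array>
#include <cmath>

// A quaternion (q0, q1, q2, q3) with scalar part q0 (hypothetical type).
struct Quat
{
    double q0, q1, q2, q3;
};

using Mat3 = std::array<std::array<double, 3>, 3>;

// Upper-left 3x3 block of the matrix in Eq. 5.47 for a unit quaternion.
Mat3 toMatrix(const Quat& q)
{
    const double q0 = q.q0, q1 = q.q1, q2 = q.q2, q3 = q.q3;
    return Mat3{{
        {{1 - 2 * q2 * q2 - 2 * q3 * q3, 2 * q1 * q2 - 2 * q0 * q3, 2 * q1 * q3 + 2 * q0 * q2}},
        {{2 * q1 * q2 + 2 * q0 * q3, 1 - 2 * q1 * q1 - 2 * q3 * q3, 2 * q2 * q3 - 2 * q0 * q1}},
        {{2 * q1 * q3 - 2 * q0 * q2, 2 * q2 * q3 + 2 * q0 * q1, 1 - 2 * q1 * q1 - 2 * q2 * q2}}}};
}

// Quaternion elements from a rotation matrix using Eq. 5.49, taking the
// positive square root for q0. Assumption: q0 is not close to zero.
Quat fromMatrix(const Mat3& m)
{
    const double q0 = 0.5 * std::sqrt(1.0 + m[0][0] + m[1][1] + m[2][2]);
    const double d = 4.0 * q0;
    return Quat{q0,
                (m[2][1] - m[1][2]) / d,
                (m[0][2] - m[2][0]) / d,
                (m[1][0] - m[0][1]) / d};
}
```

For Q = (0.5, 0.5, 0.5, 0.5), toMatrix() produces the matrix that cyclically permutes the coordinate axes, and fromMatrix() applied to that matrix recovers Q.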


If a point (or a vector) P is first transformed by a quaternion $Q_1$, and then by
a quaternion $Q_2$, the resulting point (or vector) P' is obtained by applying the
transformation formula twice and using the identity $(Q_2 Q_1)^* = Q_1^* Q_2^*$:

\[ P' = Q_2 (Q_1 P Q_1^*) Q_2^* = (Q_2 Q_1)\, P\, (Q_2 Q_1)^* \tag{5.51} \]

The above equation shows that the composite rotation is given by the quaternion
product $Q_2 Q_1$. Generalising this result, a series of rotational transformations
performed using unit quaternions $Q_1, Q_2, \ldots, Q_k$ in that order, is equivalent to
a single rotational transformation produced by the combined product quaternion
$(Q_k \cdots Q_2 Q_1)$.


5.5.2 Quaternions and Euler Angles 


In this section, we explore the relationship between unit quaternions and Euler 
angles. Using Eq. 5.44, we can represent elementary rotations about X, Y, and Z 
axes by angles y, $, 0 respectively, as follows: 


Qy — c ¥ sin¥, 0, 0) (5.52) 
Oy = (cos A 0, sin, 0) (5.53) 
Qz = (cos À 0, 0, sin ;) (5.54) 


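A sequence of Euler angle rotations $(\psi, \phi, \theta)$ is equivalent to the
quaternion product $Q_Z Q_Y Q_X$. We will denote this product by $Q_E$. Using the
quaternion multiplication rule in Eq. 5.10, we can easily express the components of
$Q_E$ in terms of the Euler angles. For convenience, the four quaternion components are
arranged as a column vector in the equation below.

$$
Q_E =
\begin{pmatrix}
\cos\frac{\psi}{2}\cos\frac{\phi}{2}\cos\frac{\theta}{2} + \sin\frac{\psi}{2}\sin\frac{\phi}{2}\sin\frac{\theta}{2} \\
\sin\frac{\psi}{2}\cos\frac{\phi}{2}\cos\frac{\theta}{2} - \cos\frac{\psi}{2}\sin\frac{\phi}{2}\sin\frac{\theta}{2} \\
\cos\frac{\psi}{2}\sin\frac{\phi}{2}\cos\frac{\theta}{2} + \sin\frac{\psi}{2}\cos\frac{\phi}{2}\sin\frac{\theta}{2} \\
\cos\frac{\psi}{2}\cos\frac{\phi}{2}\sin\frac{\theta}{2} - \sin\frac{\psi}{2}\sin\frac{\phi}{2}\cos\frac{\theta}{2}
\end{pmatrix}
\tag{5.55}
$$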
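As a quick consistency check, the closed form for the Euler-angle quaternion can be
compared against the explicit product $Q_Z Q_Y Q_X$. The sketch below is an
illustration only; `euler_to_quaternion`, `elementary` and `qmul` are hypothetical
helper names, and all angles are assumed to be in degrees.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def euler_to_quaternion(psi, phi, theta):
    """Closed form of Q_Z Q_Y Q_X for rotations psi (X), phi (Y), theta (Z)."""
    hx, hy, hz = (math.radians(a) / 2.0 for a in (psi, phi, theta))
    cx, sx = math.cos(hx), math.sin(hx)
    cy, sy = math.cos(hy), math.sin(hy)
    cz, sz = math.cos(hz), math.sin(hz)
    return (cx*cy*cz + sx*sy*sz,
            sx*cy*cz - cx*sy*sz,
            cx*sy*cz + sx*cy*sz,
            cx*cy*sz - sx*sy*cz)

def elementary(axis_index, angle):
    """Elementary rotation about X (0), Y (1) or Z (2), as in Eqs. 5.52-5.54."""
    h = math.radians(angle) / 2.0
    q = [math.cos(h), 0.0, 0.0, 0.0]
    q[axis_index + 1] = math.sin(h)
    return tuple(q)

psi, phi, theta = 30.0, 40.0, 50.0
closed_form = euler_to_quaternion(psi, phi, theta)
product = qmul(elementary(2, theta), qmul(elementary(1, phi), elementary(0, psi)))
```

The two tuples agree to rounding error. For the Euler angles (90, −90, 0) the closed
form gives (0.5, 0.5, −0.5, 0.5).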


Conversely, given a unit quaternion $Q = (q_0, q_1, q_2, q_3)$, we can compute the
equivalent Euler angle representation by comparing the elements of the quaternion
transformation matrix and the Euler angle transformation matrix. As an example, by
equating the corresponding elements from only the first column and the third row
of the matrices in Eqs. 5.47 and 5.30, we get the following expressions for the Euler
angles $\psi$, $\phi$, $\theta$ in terms of quaternion components:

$$
\begin{aligned}
\psi &= \tan^{-1}\left(\frac{2(q_0 q_1 + q_2 q_3)}{1 - 2q_1^2 - 2q_2^2}\right) \\
\phi &= \sin^{-1}\left(2 q_0 q_2 - 2 q_1 q_3\right) \\
\theta &= \tan^{-1}\left(\frac{2(q_0 q_3 + q_1 q_2)}{1 - 2q_2^2 - 2q_3^2}\right)
\end{aligned}
\tag{5.56}
$$


There are many other ways in which the above parameters can be obtained by
comparing the remaining elements of the two matrices. However, each derivation
has its own set of singularities that need to be handled as special cases. For example,
the unit quaternion

$$
\left(\frac{1}{\sqrt{2}},\ 0,\ \frac{1}{\sqrt{2}},\ 0\right)
$$

presents a singularity for $\psi$, with both the numerator and the denominator of the
first equation in Eq. 5.56 becoming zero.
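The conversion in Eq. 5.56 and its singularity can be illustrated as follows. This is
a sketch, not a definitive implementation: it uses the two-argument `math.atan2` in
place of $\tan^{-1}$ to resolve the quadrant (an assumption not stated in the text),
and the names `quaternion_to_euler` and `psi_is_singular` are hypothetical.

```python
import math

def quaternion_to_euler(q):
    """Euler angles (psi, phi, theta) in degrees from a unit quaternion (Eq. 5.56)."""
    q0, q1, q2, q3 = q
    # atan2 is an assumption of this sketch: it resolves the quadrant of tan^-1.
    psi = math.atan2(2*(q0*q1 + q2*q3), 1 - 2*q1*q1 - 2*q2*q2)
    # Clamp to [-1, 1] so rounding error cannot push asin out of its domain.
    phi = math.asin(max(-1.0, min(1.0, 2*q0*q2 - 2*q1*q3)))
    theta = math.atan2(2*(q0*q3 + q1*q2), 1 - 2*q2*q2 - 2*q3*q3)
    return tuple(math.degrees(a) for a in (psi, phi, theta))

def psi_is_singular(q, eps=1e-9):
    """True when numerator and denominator of the psi expression both vanish."""
    q0, q1, q2, q3 = q
    return abs(2*(q0*q1 + q2*q3)) < eps and abs(1 - 2*q1*q1 - 2*q2*q2) < eps

h = math.radians(20.0)
pitch_only = (math.cos(h), 0.0, math.sin(h), 0.0)   # 40 degrees about Y
r = math.sqrt(0.5)
gimbal = (r, 0.0, r, 0.0)                           # the singular example above

angles = quaternion_to_euler(pitch_only)            # approximately (0, 40, 0)
```

For `gimbal`, `psi_is_singular` returns `True`, so a practical implementation would
have to pick $\psi$ by some other rule in that case.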


5.5.3 Negative Quaternion 


In this section, we consider another geometrical property of quaternions, taking $Q_Z$
(Eq. 5.54) as an example. Figure 5.7 shows the plot of the first and the fourth
components of $Q_Z$ (its only non-zero components) as the rotation angle $\theta$ is
varied over two cycles from 0° to 720°.


Fig. 5.7 Plot showing the variation of quaternion components with rotation angle 


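Figure 5.7 shows that one cycle in quaternion space takes two revolutions in the
Cartesian coordinate space. This means that two rotations by angles $\delta$ and
$360^\circ + \delta$ that are geometrically equivalent, can have different quaternion
representations. If a unit quaternion $Q$ is given by Eq. 5.44, then replacing $\delta$
with $360^\circ + \delta$ we get,

$$
\begin{aligned}
Q' &= \left(\cos\frac{360^\circ + \delta}{2},\ l\sin\frac{360^\circ + \delta}{2},\
      m\sin\frac{360^\circ + \delta}{2},\ n\sin\frac{360^\circ + \delta}{2}\right) \\
   &= \left(-\cos\frac{\delta}{2},\ -l\sin\frac{\delta}{2},\ -m\sin\frac{\delta}{2},\
      -n\sin\frac{\delta}{2}\right) \\
   &= -Q
\end{aligned}
\tag{5.57}
$$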


The above equation shows that both Q and —Q represent the same rotational 
transformation. In the next section, we will consider the problem of interpolating 
between two orientations (which we had briefly touched on while introducing Euler 
angles), and then use some of the properties of quaternion rotations discussed above 
to define quaternion based interpolation methods. 
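The equivalence of $Q$ and $-Q$ is easy to confirm numerically. The following sketch
assumes the tuple representation $(q_0, q_1, q_2, q_3)$; the helper names and the
chosen axis, angle and test point are arbitrary illustrations.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def rotate(q, p):
    """Apply Q P Q* to the 3D point p."""
    qc = (q[0], -q[1], -q[2], -q[3])
    return qmul(qmul(q, (0.0, *p)), qc)[1:]

def unit_quaternion(axis, delta_deg):
    """(cos(d/2), l sin(d/2), m sin(d/2), n sin(d/2)) for a unit axis (l, m, n)."""
    h = math.radians(delta_deg) / 2.0
    return (math.cos(h), *(math.sin(h) * c for c in axis))

axis = (0.0, 0.6, 0.8)
q = unit_quaternion(axis, 70.0)
q_plus_turn = unit_quaternion(axis, 70.0 + 360.0)   # equals -q up to rounding
p = (1.0, 2.0, 3.0)

same_point = rotate(q, p), rotate(q_plus_turn, p)   # two identical points
```

Because $Q$ appears twice in $QPQ^*$, the two sign changes cancel and both quaternions
move `p` to the same position.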


5.6 Rotation Interpolation 


Animation sequences commonly use interpolated values between two poses. A pose
defines the position and orientation of an object. Position interpolation can be
carried out either by interpolating between the corresponding coordinate values, or by
fitting parametric curves (splines) through the points. However, interpolation between
two orientations in three-dimensional space need not always produce a smooth
transition from one orientation to another. Depending on the mechanism we use
for representing rotations, we can get completely different interpolation sequences
between the same initial and target orientations. Generally, one would prefer an
interpolation that yields an optimal path that gives minimum rotation and uniform
angular velocity between two configurations. In this section, we will compare
different interpolation methods using the different representations of rotation we
have considered so far, and establish that quaternions have a clear advantage over
the others.

Fig. 5.8 Initial configuration and two orientations of an object

We define orientation as the result of a rotational transformation from the initial 
configuration of an object to its current configuration. A configuration is uniquely 
specified by an orthogonal system of axes fixed on the object. Some of these 
concepts are explained in a little more detail below with the help of an example. 
Figure 5.8 shows a simple model, “Hammer”, constructed using four primitives, a 
cylinder, a cone, a sphere and a cube. The figure also shows two orientations of this 
model. 

The initial configuration of the object defines its orientation when no rotational
transformation is applied. In this configuration, an orthogonal right-handed system
of body-fixed axes $O x_B y_B z_B$ coincides with the inertially fixed coordinate
reference axes $OXYZ$. Without any loss of generality, we can assume that all rotations
take place about the origin. The unit vectors along the body-fixed axes have components
$x_B = (1, 0, 0)$, $y_B = (0, 1, 0)$, $z_B = (0, 0, 1)$ in the initial configuration.
An orientation can be uniquely defined using the transformed components of these three
vectors. For example, Orientation-1 in Fig. 5.8 is defined by the vectors
$x_B = (0, 0, 1)$, $y_B = (-1, 0, 0)$, $z_B = (0, -1, 0)$. During any rotational
transformation the tips of these vectors move on a unit sphere centered at the origin
(Fig. 5.9a).

The rotational transformation of an object can thus be visualized using the trace
of the unit vectors along the body-fixed axes on a unit sphere. Any unit vector has
a spherical parameterization in terms of its azimuth (or longitude) $\alpha$, and
elevation (or latitude) $\beta$ (Fig. 5.9). The variation of the tip of a vector
$v = (x_v, y_v, z_v)$ on a unit sphere can be conveniently represented as a 2D-graph of
the values $(\alpha, \beta)$ computed as follows:

$$
\alpha = \tan^{-1}\left(\frac{x_v}{z_v}\right), \qquad
\beta = \tan^{-1}\left(\frac{y_v}{\sqrt{x_v^2 + z_v^2}}\right)
\tag{5.58}
$$

Fig. 5.9 Spherical parameterization of rotations: (a) Movement of unit vectors attached to body
axes during a rotation of the object. (b) Parametric representation of unit vectors on a sphere
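The $(\alpha, \beta)$ values of a unit vector can be computed as sketched below. The
convention is an assumption inferred from the values listed in Table 5.1: the azimuth
is measured from the +Z axis towards +X, and the elevation towards +Y. The function
name `alpha_beta` and the use of `None` for an indeterminate azimuth are choices of
this sketch.

```python
import math

def alpha_beta(v, eps=1e-12):
    """Azimuth and elevation (degrees) of a unit vector v = (x, y, z).

    Returns alpha = None when the vector points along +Y or -Y, where the
    azimuth is indeterminate.
    """
    x, y, z = v
    horizontal = math.hypot(x, z)           # length of the projection on the XZ plane
    beta = math.degrees(math.atan2(y, horizontal))
    alpha = None if horizontal < eps else math.degrees(math.atan2(x, z))
    return alpha, beta

# Body axes of Orientation-1 in the "Hammer" example.
x_b = alpha_beta((0.0, 0.0, 1.0))    # (0, 0)
y_b = alpha_beta((-1.0, 0.0, 0.0))   # (-90, 0)
z_b = alpha_beta((0.0, -1.0, 0.0))   # (None, -90): azimuth indeterminate
```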


Table 5.1 Graph values (α, β) of the two orientations in Fig. 5.8

                           Orientation-1       Orientation-2
  x_B                      (0, 0)              (±180, 0)
  y_B                      (-90, 0)            (~, 90)
  z_B                      (~, -90)            (90, 0)
  Transformation matrix    [ 0  -1   0 ]       [  0   0   1 ]
                           [ 0   0  -1 ]       [  0   1   0 ]
                           [ 1   0   0 ]       [ -1   0   0 ]

~ indicates an indeterminate value


We call the above method of representing the three-dimensional variations of
a unit vector the $\alpha\beta$-graph method. Note that when $\beta = \pm 90^\circ$,
the value of $\alpha$ is indeterminate. In the following sections, we will use
$\alpha\beta$-graphs of the body-fixed axes for a given interpolation sequence to
compare the paths generated by different methods. For the example given in Fig. 5.8,
the graph values (in degrees) of Orientation-1 and Orientation-2 are shown in
Table 5.1. The variation of a graph between the two points will help us visualize how
a sequence of rotational transformations operates on an object for transforming it
from one orientation to the other.

Another method for visualizing three-dimensional rotations is to show a small
triangle (see Fig. 5.9a) at the position of one of the body axes (say, $z_B$) on the
unit sphere, oriented towards another axis (say, $y_B$). The triangle uniquely represents
the three-dimensional orientation of the object. Triangles displayed at equal time 
intervals during a rotational transformation will clearly show the movement of an 
axis of interest, and also indicate the spin of the object about that axis (see Fig. 5.10). 


5.6.1 Euler Angle Interpolation 


Let us first consider the interpolation between two orientations represented using
Euler angles. For our example, we will use the Euler angle sequence
$\{\psi, \phi, \theta\}$ introduced in Sect. 5.4.1. Given two sets of Euler angles
$\{\psi_1, \phi_1, \theta_1\}$ and $\{\psi_2, \phi_2, \theta_2\}$, all intermediate sets
can be obtained using a linear interpolation between the corresponding Euler angles:

$$
\begin{aligned}
\psi &= (1 - t)\,\psi_1 + t\,\psi_2 \\
\phi &= (1 - t)\,\phi_1 + t\,\phi_2 \\
\theta &= (1 - t)\,\theta_1 + t\,\theta_2, \qquad 0 \le t \le 1.
\end{aligned}
\tag{5.59}
$$

Fig. 5.10 Interpolation sequence generated using Euler angles (90, −90, 0) and (0, 90, 0)


The transformation matrix in Eq. 5.30 then defines the rotation from the initial
configuration to the intermediate orientation. Earlier in Fig. 5.5, we saw examples
of interpolation sequences generated in this manner. For the example given in
Fig. 5.8, Orientation-1 is defined by Euler angles
$\{\psi_1 = 90^\circ, \phi_1 = -90^\circ, \theta_1 = 0^\circ\}$, and Orientation-2 by
$\{\psi_2 = 0^\circ, \phi_2 = 90^\circ, \theta_2 = 0^\circ\}$. The $\alpha\beta$-graph
for the interpolation sequence is given in Fig. 5.10. For this specific example, linear
interpolation in the domain of Euler angles also generates a perfect linear
interpolation in $\alpha\beta$-space, consisting of equidistant points. However, when
we look at the trace of the hammer's axis from the −Y direction to the +X direction on
the surface of the unit sphere, we observe that the rotational motion from the source
to the destination in three-dimensional space is not uniform.

The “Hammer” example in Fig. 5.8 also presents an interesting aspect of Euler
angles. Orientation-2 can have an infinite number of Euler angle representations
given by $\{\psi_2 = \lambda, \phi_2 = 90^\circ, \theta_2 = \lambda\}$ where $\lambda$
is any value. Thus between the same two orientations, we can have several interpolation
paths using Euler angles. As an example, the interpolated values obtained using
$\lambda = -170^\circ$ give a distinctly different and curvilinear path between
Orientation-1 and Orientation-2, as shown in Fig. 5.11.


5.6.2 Axis-Angle Interpolation 


Fig. 5.11 Interpolation sequence generated using Euler angles (90, −90, 0) and (−170, 90, −170)

Fig. 5.12 Interpolation sequence generated using the angle-axis transformation

The equivalent angles and axes of rotation for both Orientation-1 and Orientation-2
can be computed from the corresponding transformation matrices using Eq. 5.41.
The parameters for Orientation-1 are $\delta_1 = 120^\circ$, $l_1 = 0.57735$,
$m_1 = -0.57735$, $n_1 = 0.57735$, and for Orientation-2 the values are
$\delta_2 = 90^\circ$, $l_2 = 0$, $m_2 = 1$, $n_2 = 0$. A straightforward linear
interpolation gives

$$
\begin{aligned}
\delta &= (1 - t)\,\delta_1 + t\,\delta_2 \\
l &= (1 - t)\,l_1 + t\,l_2 \\
m &= (1 - t)\,m_1 + t\,m_2 \\
n &= (1 - t)\,n_1 + t\,n_2, \qquad 0 \le t \le 1.
\end{aligned}
\tag{5.60}
$$


The interpolated vector will need to be normalized before constructing the 
transformation matrix in Eq. 5.38. The intermediate orientations generated using 
the above equation are shown in Fig. 5.12. 
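A minimal sketch of this interpolation, including the normalization step, is given
below. The function name `interpolate_axis_angle` and the choice of eleven samples are
assumptions made for illustration; angles are in degrees.

```python
import math

def interpolate_axis_angle(d1, u1, d2, u2, t):
    """Linear interpolation of angle and axis (Eq. 5.60), then axis renormalization."""
    d = (1.0 - t) * d1 + t * d2
    u = [(1.0 - t) * a + t * b for a, b in zip(u1, u2)]
    n = math.sqrt(sum(c * c for c in u))
    if n < 1e-12:
        # The blended axis can vanish (e.g. for opposite axes); no direction exists.
        raise ValueError("interpolated axis has zero length")
    return d, tuple(c / n for c in u)

s = 1.0 / math.sqrt(3.0)                    # 0.57735...
orientation_1 = (120.0, (s, -s, s))
orientation_2 = (90.0, (0.0, 1.0, 0.0))

samples = [interpolate_axis_angle(*orientation_1, *orientation_2, k / 10.0)
           for k in range(11)]
```

Each sample is an (angle, unit axis) pair that can be passed to the transformation
matrix of Eq. 5.38. The angle changes at a constant rate, but the renormalized axis
does not, which is one source of the non-uniform motion.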

In the example shown above, the angle-axis transformation generates a non-uniform
motion with a large variation in the angular velocity. As can be seen from both the
$\alpha\beta$-graph and the trace on the sphere, the density of points around
the source and the destination points is very large compared to the middle. The
parameters used in the interpolation belong to completely different domains, the
angle being a scalar and the axis of rotation being a vector. Quaternions help us to
eliminate this disparity in the type of the interpolants, and achieve a rotation where
both the axis of rotation as well as the rate of change of angle remain constant. In the
next section, we consider a linear interpolation using quaternions.

Fig. 5.13 Interpolated and unit quaternions on a unit sphere in quaternion space


5.6.3 Quaternion Linear Interpolation (LERP) 


Given two unit quaternions $Q_1 = (q_0^{(1)}, q_1^{(1)}, q_2^{(1)}, q_3^{(1)})$ and
$Q_2 = (q_0^{(2)}, q_1^{(2)}, q_2^{(2)}, q_3^{(2)})$, a linear interpolation gives the
quaternion

$$
Q = (1 - t)\,Q_1 + t\,Q_2, \qquad 0 \le t \le 1. \tag{5.61}
$$


The quaternion resulting from the above equation is converted to a unit quaternion
before a transformation of the form $QPQ^*$ is applied to all points $P$ of the
object. Every unit quaternion lies on a unit sphere in the four-dimensional space
spanned by the quaternion basis (1, i, j, k). The interpolated quaternions obtained
from Eq. 5.61 lie on a straight line between the two points $Q_1$ and $Q_2$. Converting
them to unit quaternions moves each interpolated quaternion to the surface of the
sphere along a radial line (Fig. 5.13), resulting in an uneven distribution of points
and a corresponding non-uniformity in the angular velocity of the object. The speed in
the middle of the interpolation path is generally much higher than the speed at the
end points. The interpolated quaternions after normalization lie on an arc of a great
circle between $Q_1$ and $Q_2$.

Continuing with our “Hammer” example in Fig. 5.8, the source and the target
orientations in Table 5.1 can be converted into quaternions using Eq. 5.49. For
Orientation-1, the quaternion parameters are $q_0^{(1)} = 0.5$, $q_1^{(1)} = 0.5$,
$q_2^{(1)} = -0.5$, $q_3^{(1)} = 0.5$, and for Orientation-2, the values are
$q_0^{(2)} = 0.71$, $q_1^{(2)} = 0$, $q_2^{(2)} = 0.71$, $q_3^{(2)} = 0$. The
$\alpha\beta$-graph and the trace of the hammer axis on the sphere are shown
in Fig. 5.14.

Fig. 5.14 Interpolation sequence generated using quaternion linear interpolation

The interpolation path obtained using quaternions is along a circular arc between
the end points, which is often the most desired path. However, the non-uniform
spacing of points along the arc indicates that the angular velocity is initially smaller,
then increases towards the middle and slows down again towards the target.
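The non-uniform spacing can be reproduced with a few lines of code. The sketch below
applies Eq. 5.61 followed by normalization to the two “Hammer” quaternions and
measures the angle, on the quaternion sphere, between $Q_1$ and each interpolated
quaternion. The names `nlerp` and `angle_between` are hypothetical.

```python
import math

def nlerp(q1, q2, t):
    """Linear interpolation (Eq. 5.61) followed by conversion to a unit quaternion."""
    q = [(1.0 - t) * a + t * b for a, b in zip(q1, q2)]
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def angle_between(a, b):
    """Angle in degrees between two unit quaternions treated as 4D vectors."""
    d = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(d))

r = math.sqrt(0.5)
q1 = (0.5, 0.5, -0.5, 0.5)      # Orientation-1
q2 = (r, 0.0, r, 0.0)           # Orientation-2

omega = angle_between(q1, q2)   # 90 degrees for this example
angles = [angle_between(q1, nlerp(q1, q2, t)) for t in (0.25, 0.5, 0.75)]
# roughly 18.4, 45.0 and 71.6 degrees rather than 22.5, 45.0 and 67.5
```

Equal steps in $t$ therefore cover about 18.4° near the end points but about 26.6° in
the middle of the arc, in agreement with the behaviour described above.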


Fig. 5.15 Subdivision of the angle between unit quaternions


5.6.4 Quaternion Spherical Linear Interpolation (SLERP) 


In the previous section we saw that linear interpolation generates intermediate
quaternions along a chord between $Q_1$ and $Q_2$ (Fig. 5.13) on the unit sphere in
quaternion space. If we subdivide the angle between $Q_1$ and $Q_2$ uniformly, then we
will get an even distribution of points on the sphere. Such a distribution will also
yield a smooth rotation of the object from one orientation to another with nearly
constant angular velocity. The spherical linear interpolation (SLERP) technique uses
this approach to compute intermediate quaternions.

Figure 5.15 shows the geometrical constructions needed to derive the SLERP
formula. In the figure, $Q_1 = (q_0^{(1)}, q_1^{(1)}, q_2^{(1)}, q_3^{(1)})$ and
$Q_2 = (q_0^{(2)}, q_1^{(2)}, q_2^{(2)}, q_3^{(2)})$ are any two unit quaternions and
$P$ is another unit quaternion that is orthogonal to $Q_1$. Treating them as vectors in
quaternion space, $Q_2 - Q_1 \cos\Omega$ is a vector (denoted by $R$) from $Q_2'$ (the
projection of $Q_2$ on $Q_1$) to $Q_2$, where $\Omega$ is the angle between $Q_1$ and
$Q_2$. $\Omega$ is computed from the following formula:

$$
\cos\Omega = q_0^{(1)} q_0^{(2)} + q_1^{(1)} q_1^{(2)} + q_2^{(1)} q_2^{(2)} + q_3^{(1)} q_3^{(2)}
\tag{5.62}
$$

Dividing $R$ by its magnitude ($\sin\Omega$), we get the unit quaternion in the
direction of $R$. Thus

$$
P = \frac{Q_2 - Q_1 \cos\Omega}{\sin\Omega} \tag{5.63}
$$

Fig. 5.16 Interpolation sequence generated using quaternion spherical linear interpolation


Figure 5.15 shows the angle between $Q_1$ and $Q_2$ subdivided using an interpolation
parameter $t$ ($0 \le t \le 1$), and the interpolated unit quaternion $Q$ generated
using this subdivision. Resolving $Q$ along the orthogonal unit directions of $Q_1$ and
$P$ we get

$$
Q = Q_1 \cos(t\Omega) + P \sin(t\Omega) \tag{5.64}
$$

Substituting Eq. 5.63 and simplifying we get

$$
Q = \frac{Q_1 \sin\big((1 - t)\Omega\big) + Q_2 \sin(t\Omega)}{\sin\Omega} \tag{5.65}
$$


The above equation has a singularity when $\Omega = 0$ or $\pm 180^\circ$. When
$\Omega = 0$, both the initial and final quaternions are the same, and therefore no
interpolation is necessary. When $\Omega = \pm 180^\circ$, $Q_2 = -Q_1$. From Eq. 5.57
we know that this condition also corresponds to the situation where both orientations
are the same.
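The SLERP formula of Eq. 5.65 translates almost directly into code. The sketch below
is one possible realization: the function name `slerp` and the tolerance `eps` are
assumptions, and in the singular cases it simply returns $Q_1$, since both end points
then describe the same orientation.

```python
import math

def slerp(q1, q2, t, eps=1e-9):
    """Spherical linear interpolation between unit quaternions (Eq. 5.65)."""
    cos_omega = max(-1.0, min(1.0, sum(a * b for a, b in zip(q1, q2))))  # Eq. 5.62
    omega = math.acos(cos_omega)
    s = math.sin(omega)
    if abs(s) < eps:
        # Omega is 0 or 180 degrees: Q2 = Q1 or Q2 = -Q1, nothing to interpolate.
        return tuple(q1)
    w1 = math.sin((1.0 - t) * omega) / s
    w2 = math.sin(t * omega) / s
    return tuple(w1 * a + w2 * b for a, b in zip(q1, q2))

r = math.sqrt(0.5)
q1 = (0.5, 0.5, -0.5, 0.5)      # Orientation-1 of the "Hammer" example
q2 = (r, 0.0, r, 0.0)           # Orientation-2

quarter = slerp(q1, q2, 0.25)   # 22.5 degrees away from q1 on the quaternion sphere
```

Unlike normalized linear interpolation, the angle between $Q_1$ and `slerp(q1, q2, t)`
grows in direct proportion to $t$, and the result is already of unit length.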

The interpolated sequence generated by Eq. 5.65 for the “Hammer” example is
shown in Fig. 5.16. Compared with the results obtained from previously discussed
forms of interpolation, the smoothness of the interpolating curves as well as the
uniformity in the distribution of points along them are noticeable. Spherical linear
interpolation yields an optimal angle interpolation between two orientations with a
constant axis of rotation. If the interpolation parameter is incremented in constant
steps, spherical linear interpolation will generate a motion with constant angular
velocity.

Fig. 5.17 Two different interpolation paths (a, b) on the quaternion sphere

When interpolating between two quaternions $Q_1$ and $Q_2$, we have to make sure
that the interpolation is performed along the shorter arc on the great circle through
the two points on the quaternion sphere. If the angle $\Omega$ between $Q_1$ and $Q_2$
is less than 90°, we interpolate between the two quaternions (Fig. 5.17a), otherwise
we interpolate between $Q_1$ and $-Q_2$ (Fig. 5.17b). In other words, if
$Q_1 \cdot Q_2 = \cos\Omega < 0$, we negate the sign of $Q_2$. The value of
$\cos\Omega$ is computed using the formula in Eq. 5.62.
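The shorter-arc test amounts to a single dot product. A minimal sketch, with the
hypothetical helper name `shorter_arc_target`, is:

```python
import math

def shorter_arc_target(q1, q2):
    """Return q2 or -q2, whichever is within 90 degrees of q1 on the quaternion sphere."""
    cos_omega = sum(a * b for a, b in zip(q1, q2))   # Eq. 5.62
    if cos_omega < 0.0:
        return tuple(-c for c in q2)                 # -Q2 represents the same rotation
    return tuple(q2)

h = math.radians(10.0)
q1 = (1.0, 0.0, 0.0, 0.0)                            # no rotation
q2 = (-math.cos(h), 0.0, 0.0, -math.sin(h))          # 20 degrees about Z, written as -Q

target = shorter_arc_target(q1, q2)                  # (cos 10deg, 0, 0, sin 10deg)
```

Interpolating from `q1` to `target` turns the object through 20°, whereas interpolating
to the original `q2` would take it the long way round through 340°.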

The following sections discuss a few more applications of quaternions for 
representing transformations in a three-dimensional space. 


5.7 Quaternion Exponentiation 


We will extend the notion of exponentiation from the field of complex numbers to 
the domain of quaternions and also define the associated logarithmic function that is 
consistent with exponentiation. However, there are some subtle differences between 
the way in which these operations are performed on real and complex numbers and 
the way they are applied to quaternions. 

From Eq. 5.6 we know that a unit complex number can be expressed as
$z = (\cos\delta, \sin\delta)$. The same complex number has an alternate representation
in the form $z = e^{i\delta}$. This is the well-known Euler's formula in complex
numbers. We know that a unit quaternion can also be written as
$Q = (\cos\delta, u\sin\delta)$. Similar to complex numbers, an exponential notation
for unit quaternions can be introduced as follows:

$$
Q = (\cos\delta,\ u\sin\delta) = e^{u\delta} \tag{5.66}
$$

where $u = (l, m, n)$ is a unit vector in three-dimensional space. For the time
being, we will treat the above equation as only an alternate representation of unit
quaternions. We will see the formal definition of the exponential function and how
it is related to the above notation immediately after the next equation. The logarithm
of the unit quaternion in Eq. 5.66 is defined as

$$
Q_L = \log(Q) = \log(e^{u\delta}) = (0,\ u\delta) = (0,\ l\delta,\ m\delta,\ n\delta) \tag{5.67}
$$

$Q_L$ is a pure quaternion and its magnitude is $\delta$. The definition of an
exponential function for quaternions must be consistent with the above operation and
be the inverse of the log() function, such that $\exp(\log(Q)) = Q$. We thus have the
following definition:

$$
\exp(Q_L) = \exp((0,\ u\delta)) = (\cos\delta,\ u\sin\delta) = e^{u\delta} \tag{5.68}
$$

The above definition leads to the following important result for any unit
quaternion $Q = (\cos\delta, u\sin\delta)$, and any real value $t$:

$$
Q^t = \exp(t\log(Q)) = \exp((0,\ ut\delta)) = (\cos(t\delta),\ u\sin(t\delta)) \tag{5.69}
$$

Note that the operations $Q^t$ and $\exp(Q_L)$ both return unit quaternions. As a
special case, when $t = 0$, we have

$$
Q^0 = (1, 0, 0, 0) \quad \text{for any unit quaternion } Q. \tag{5.70}
$$


Since quaternion multiplication is non-commutative, it immediately follows that, in
general, $P^a Q^b \neq Q^b P^a$ and $\log(PQ) \neq \log(P) + \log(Q)$. However, the
following equations are valid for all unit quaternions $Q$:

$$
Q^a Q^b = Q^{a+b}, \qquad (Q^a)^b = Q^{ab} \tag{5.71}
$$


We know that the unit quaternion $Q$ given in Eq. 5.66 represents a rotation by
an angle $2\delta$ about the unit vector $u$ passing through the origin. From Eq. 5.69,
we see that raising $Q$ to the power of $t$ effectively changes the angle of rotation.
Thus if $0 \le t \le 1$, then $Q^t$ gives a unit quaternion that represents a partial
rotation $2t\delta$. This result is useful for interpolating between orientations. In
the next section, we will
define the relative quaternion between two orientations, and then apply Eq. 5.69 to 
perform incremental rotations along a path from the source orientation to the target 
orientation. As a result, we will get another equation for the quaternion spherical 
linear interpolation using the exponential notation. 
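The logarithm, exponential and power operations for unit quaternions can be sketched
as follows. The function names `qlog`, `qexp` and `qpow` are hypothetical. The handling
of a vanishing vector part is a simplification of this sketch: the rotation axis is
then undefined, and the code treats the input as the identity.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qlog(q):
    """Eq. 5.67: log of the unit quaternion (cos d, u sin d) is (0, u d)."""
    s = math.sqrt(q[1]**2 + q[2]**2 + q[3]**2)      # |sin d|
    if s < 1e-12:
        return (0.0, 0.0, 0.0, 0.0)                 # axis undefined (sketch assumption)
    d = math.atan2(s, q[0])
    return (0.0, d * q[1] / s, d * q[2] / s, d * q[3] / s)

def qexp(ql):
    """Eq. 5.68: exp of the pure quaternion (0, u d) is (cos d, u sin d)."""
    d = math.sqrt(ql[1]**2 + ql[2]**2 + ql[3]**2)
    if d < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    k = math.sin(d) / d
    return (math.cos(d), k * ql[1], k * ql[2], k * ql[3])

def qpow(q, t):
    """Eq. 5.69: Q^t = exp(t log Q)."""
    return qexp(tuple(t * c for c in qlog(q)))

d = math.radians(40.0)
q = (math.cos(d), 0.0, 0.0, math.sin(d))   # rotation by 2*d = 80 degrees about Z
half = qpow(q, 0.5)                        # rotation by 40 degrees about Z
```

With these helpers, `qmul(qpow(q, 0.3), qpow(q, 0.4))` agrees with `qpow(q, 0.7)` to
rounding error, and `qpow(q, 0)` returns (1, 0, 0, 0).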


5.8 Relative Quaternions 


In Sect. 5.6, we defined the three-dimensional orientation of an object using the
parameters of the rotation that transforms the object from its initial configuration
to the current configuration. This rotation can be represented by a unit quaternion.
Thus two independent orientations of an object can be represented by two unit
quaternions $Q_1$ and $Q_2$ (Fig. 5.18). In the following, we try to find the relative
quaternion that performs a rotation from the first orientation to the second. This
relative quaternion can be easily obtained by noting how $Q_1$ and $Q_2$ transform
points from one frame to another.

Fig. 5.18 The relative quaternion transforms an object from one orientation to another

In Fig. 5.18, the point $P_1$ in Orientation-1 corresponds to the point $P$ in the
initial configuration. In other words, the quaternion $Q_1$ transforms $P$ into $P_1$.
Similarly the quaternion $Q_2$ transforms $P$ into $P_2$ in Orientation-2. Therefore,

$$
\begin{aligned}
P_1 &= Q_1 P Q_1^* \\
P_2 &= Q_2 P Q_2^*
\end{aligned}
\tag{5.72}
$$

Now we seek a quaternion that transforms $P_1$ into $P_2$. From the first equation
above, we get the inverse transformation,

$$
P = Q_1^* P_1 Q_1 \tag{5.73}
$$

Substituting in the second equation, we have

$$
\begin{aligned}
P_2 &= Q_2 Q_1^* P_1 Q_1 Q_2^* \\
    &= (Q_2 Q_1^*)\, P_1\, (Q_2 Q_1^*)^*
\end{aligned}
\tag{5.74}
$$

The above equation shows that the quaternion $Q_2 Q_1^*$ transforms the point $P_1$
into $P_2$, and therefore represents the transformation from Orientation-1 to
Orientation-2. Note that $Q_2 Q_1^*$ is a unit quaternion. $Q_2 Q_1^*$ is called the
relative quaternion between $Q_1$ and $Q_2$.

We now revisit the problem of interpolating between Orientation-1 and
Orientation-2. Any intermediate orientation in the above example can be obtained
by first applying the unit quaternion $Q_1$ to get to Orientation-1 from the initial
configuration, and then applying a partial rotation using the relative quaternion
$Q_2 Q_1^*$. From Eq. 5.69, we know that this partial rotation can be effected by
$(Q_2 Q_1^*)^t$, where $0 \le t \le 1$. Combining the two transformations together, we
get the quaternion $(Q_2 Q_1^*)^t Q_1$. By varying $t$ uniformly between 0 and 1, we get
the quaternions that interpolate between the two orientations. What we have just
obtained is another form for the quaternion spherical interpolation (SLERP) formula
using the exponent function. The pseudo-code in Listing 5.1 outlines this method
for rotation interpolation. When $t = 0$, $(Q_2 Q_1^*)^t Q_1$ becomes $Q_1$, and when
$t = 1$, the interpolated quaternion becomes identical to $Q_2$.

Listing 5.1 Pseudo-code for computing SLERP equation in exponent form

Inputs: Q1, Q2
1. Compute the product S = Q2(Q1*)
2. Compute the angle of rotation δ from S using Eq. 5.45
3. Compute the axis of rotation u from S using Eq. 5.46
4. For t varying uniformly within the interval [0, 1]
   4.1 Form the quaternion S_t = (cos(tδ/2), u sin(tδ/2))
   4.2 Compute the product T = S_t Q1
   4.3 Transform every point P using the quaternion T
   4.4 Render the object
   4.5 End
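The exponent form of SLERP, $(Q_2 Q_1^*)^t Q_1$, can be sketched in Python as shown
below. This is an illustration under stated assumptions rather than the companion
implementation: the helper names are hypothetical, rendering and point transformation
are omitted, and `qpow` treats a relative quaternion with a vanishing vector part as
the identity.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(q):
    """Quaternion conjugate Q*."""
    return (q[0], -q[1], -q[2], -q[3])

def qpow(q, t):
    """Q^t for a unit quaternion Q = (cos d, u sin d): (cos(t d), u sin(t d))."""
    s = math.sqrt(q[1]**2 + q[2]**2 + q[3]**2)
    if s < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)    # no axis defined; treated as identity here
    d = math.atan2(s, q[0])
    k = math.sin(t * d) / s
    return (math.cos(t * d), k * q[1], k * q[2], k * q[3])

def slerp_exponent_form(q1, q2, t):
    """Interpolated quaternion (Q2 Q1*)^t Q1."""
    relative = qmul(q2, qconj(q1))     # relative quaternion between Q1 and Q2
    return qmul(qpow(relative, t), q1)

r = math.sqrt(0.5)
q1 = (0.5, 0.5, -0.5, 0.5)             # Orientation-1
q2 = (r, 0.0, r, 0.0)                  # Orientation-2
path = [slerp_exponent_form(q1, q2, k / 4.0) for k in range(5)]
```

`path[0]` and `path[4]` reproduce $Q_1$ and $Q_2$ to rounding error, and the
intermediate quaternions coincide with those given by the sine-weighted SLERP formula.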


5.9 Dual Quaternions 


In previous sections we saw applications of unit quaternions in representing
rotational transformations. Dual quaternions generalize the notion of quaternions
to an 8-tuple, and provide a convenient tool for representing rigid body
transformations containing both rotations and translations in three-dimensional space.
The mathematical structure of dual quaternions uses two quaternions that are combined
using the algebra of dual numbers. Before considering the theoretical aspects of dual
quaternions, we will look at the definition and properties of dual numbers.


5.9.1 Dual Numbers 


The structure and the algebra of dual numbers are very similar to complex numbers.
Given two real numbers $a$ and $b$, a dual number can be written as $a + \varepsilon b$,
where $\varepsilon^2 = 0$. The number $a$ is then called the real part, and $b$ the
dual part. $\varepsilon$ is often referred to as the dual unit. As in the case of
complex numbers, we can use a tuple notation $d = (a, b)$ to represent a dual number.
The algebra of dual numbers satisfies the following rules for addition and
multiplication:

$$
(a_1, b_1) \pm (a_2, b_2) = (a_1 \pm a_2,\ b_1 \pm b_2) \tag{5.75}
$$
$$
(a_1, b_1)\,(a_2, b_2) = (a_1 a_2,\ a_1 b_2 + a_2 b_1) \tag{5.76}
$$
$$
c\,(a, b) = (ca,\ cb), \quad \text{for any real number } c. \tag{5.77}
$$


Using the multiplication rule in Eq. 5.76 we find that

$$
(a, b)\left(\frac{1}{a},\ \frac{-b}{a^2}\right) = (1, 0) \tag{5.78}
$$

Therefore, the second term in the product above is the multiplicative inverse of
$(a, b)$, provided $a \neq 0$. The conjugate of a dual number $d = (a, b)$ is defined in
a way similar to that of a complex number:

$$
d^* = (a, -b) \tag{5.79}
$$

Using Eq. 5.76, it can be verified that $dd^* = a^2$. We also note that
$(a, b)^2 = (a^2, 2ab)$. Hence,

$$
\left(a,\ \frac{b}{2a}\right)^2 = (a^2,\ b) \tag{5.80}
$$

The above equation directly leads to the definition of the square root of a dual
number:

$$
\sqrt{(a, b)} = \left(\sqrt{a},\ \frac{b}{2\sqrt{a}}\right) \tag{5.81}
$$


In the next section, we will extend the concepts introduced above to the algebra
of dual quaternions. For notational convenience, dual numbers will often be written
as $(a, \hat{a})$.
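The rules for dual numbers (Eqs. 5.75 to 5.81) can be collected in a small class. The
class name `Dual` and its method names are hypothetical choices for this sketch.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """Dual number a + eps*b with eps^2 = 0, stored as the tuple (a, b)."""
    a: float   # real part
    b: float   # dual part

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)               # Eq. 5.75

    def __sub__(self, other):
        return Dual(self.a - other.a, self.b - other.b)               # Eq. 5.75

    def __mul__(self, other):
        return Dual(self.a * other.a,
                    self.a * other.b + other.a * self.b)              # Eq. 5.76

    def conj(self):
        return Dual(self.a, -self.b)                                  # Eq. 5.79

    def inverse(self):
        if self.a == 0:
            raise ZeroDivisionError("a dual number with a = 0 has no inverse")
        return Dual(1.0 / self.a, -self.b / (self.a * self.a))        # Eq. 5.78

    def sqrt(self):
        root = math.sqrt(self.a)   # requires a > 0
        return Dual(root, self.b / (2.0 * root))                      # Eq. 5.81

eps = Dual(0.0, 1.0)
d = Dual(2.0, 3.0)
identity = d * d.inverse()     # Dual(a=1.0, b=0.0)
squared_norm = d * d.conj()    # Dual(a=4.0, b=0.0), i.e. a^2
root = Dual(4.0, 3.0).sqrt()   # Dual(a=2.0, b=0.75)
```

Multiplying `eps` by itself gives `Dual(0.0, 0.0)`, which is the defining property
$\varepsilon^2 = 0$.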


5.9.2 Algebra of Dual Quaternions 


A dual quaternion is a quaternion constructed using dual numbers as its components:
$\mathbf{Q} = (\mathbf{q}_0, \mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3)$, where
$\mathbf{q}_i = (q_i, \hat{q}_i)$, $i = 0, \ldots, 3$ (bold symbols denote dual
quantities). Equivalently, we can also define a dual quaternion as a dual number whose
components are quaternions: $\mathbf{Q} = (Q, \hat{Q})$ where
$Q = (q_0, q_1, q_2, q_3)$, and $\hat{Q} = (\hat{q}_0, \hat{q}_1, \hat{q}_2, \hat{q}_3)$.
$\mathbf{Q}$ is a pure dual quaternion if $\mathbf{q}_0 = 0$, or equivalently if
$q_0 = \hat{q}_0 = 0$. We can also represent any dual quaternion $\mathbf{Q}$ as an
8-tuple $(q_0, q_1, q_2, q_3, \hat{q}_0, \hat{q}_1, \hat{q}_2, \hat{q}_3)$. The
following representation of $\mathbf{Q}$ reveals the products of quaternion units and
the dual units that are associated with each component of the 8-tuple.

$$
\mathbf{Q} = q_0 + i q_1 + j q_2 + k q_3 + \varepsilon \hat{q}_0 + \varepsilon i\, \hat{q}_1
+ \varepsilon j\, \hat{q}_2 + \varepsilon k\, \hat{q}_3 \tag{5.82}
$$

Table 5.2 Multiplication table for dual quaternion units (row unit times column unit)

        1     i     j     k     ε     εi    εj    εk
  1     1     i     j     k     ε     εi    εj    εk
  i     i    -1     k    -j     εi   -ε     εk   -εj
  j     j    -k    -1     i     εj   -εk   -ε     εi
  k     k     j    -i    -1     εk    εj   -εi   -ε
  ε     ε     εi    εj    εk    0     0     0     0
  εi    εi   -ε     εk   -εj    0     0     0     0
  εj    εj   -εk   -ε     εi    0     0     0     0
  εk    εk    εj   -εi   -ε     0     0     0     0


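The following dual quaternions form a mutually orthogonal set of basis vectors
for the entire 8-dimensional space of dual quaternions.

$$
\begin{aligned}
\mathbf{i}_0 &= 1 = (1, 0, 0, 0, 0, 0, 0, 0) \\
\mathbf{i}_1 &= i = (0, 1, 0, 0, 0, 0, 0, 0) \\
\mathbf{i}_2 &= j = (0, 0, 1, 0, 0, 0, 0, 0) \\
\mathbf{i}_3 &= k = (0, 0, 0, 1, 0, 0, 0, 0) \\
\mathbf{i}_4 &= \varepsilon = (0, 0, 0, 0, 1, 0, 0, 0) \\
\mathbf{i}_5 &= \varepsilon i = (0, 0, 0, 0, 0, 1, 0, 0) \\
\mathbf{i}_6 &= \varepsilon j = (0, 0, 0, 0, 0, 0, 1, 0) \\
\mathbf{i}_7 &= \varepsilon k = (0, 0, 0, 0, 0, 0, 0, 1)
\end{aligned}
\tag{5.83}
$$

Any dual quaternion is a linear combination of the above basis vectors:

$$
\mathbf{Q} = \sum_{k=0}^{7} \mathbf{i}_k\, q_k \tag{5.84}
$$

where $q_4, \ldots, q_7$ stand for the dual components $\hat{q}_0, \ldots, \hat{q}_3$
of the 8-tuple.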


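Using the multiplication rule for the quaternion basis, we observe that
$i(\varepsilon j) = \varepsilon k$, $(\varepsilon k)j = -\varepsilon i$,
$(\varepsilon i)(\varepsilon j) = 0$, etc. Note also that $\varepsilon i = i\varepsilon$.
The complete multiplication table is given in Table 5.2. The multiplication rule for
dual numbers given in Eq. 5.76 can be extended to dual quaternions:

$$
\mathbf{P}\mathbf{Q} = (P, \hat{P})\,(Q, \hat{Q}) = (PQ,\ P\hat{Q} + \hat{P}Q) \tag{5.85}
$$

We can also multiply a dual quaternion $\mathbf{P}$ by a quaternion $Q$:

$$
\mathbf{P}Q = (P, \hat{P})\,Q = (PQ,\ \hat{P}Q) \tag{5.86}
$$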


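The conjugate of a dual quaternion is defined in three different ways, as discussed
below.

Conjugate type 1: As mentioned in the beginning of this section, we can treat a
dual quaternion $\mathbf{Q}$ as a quaternion with dual number components
$(\mathbf{q}_0, \mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3)$. Applying the rule for a
quaternion conjugate, we get
$\mathbf{Q}^* = (\mathbf{q}_0, -\mathbf{q}_1, -\mathbf{q}_2, -\mathbf{q}_3)$. Hence

$$
\mathbf{Q}^* = (Q^*, \hat{Q}^*) = (q_0, -q_1, -q_2, -q_3,\ \hat{q}_0, -\hat{q}_1, -\hat{q}_2, -\hat{q}_3) \tag{5.87}
$$

This definition satisfies the following property:

$$
\begin{aligned}
\mathbf{Q}\mathbf{Q}^* &= (QQ^*,\ Q\hat{Q}^* + \hat{Q}Q^*) \\
&= \left(q_0^2 + q_1^2 + q_2^2 + q_3^2,\ 2(q_0\hat{q}_0 + q_1\hat{q}_1 + q_2\hat{q}_2 + q_3\hat{q}_3)\right) \\
&= \mathbf{Q}^*\mathbf{Q}
\end{aligned}
\tag{5.88}
$$

In the above derivation, note that
$\mathbf{Q}\mathbf{Q}^* = (|Q|^2,\ 2\,Q \cdot \hat{Q})$, where $\cdot$ indicates
the dot product between the two quaternions. It can also be easily verified that
$(\mathbf{P}\mathbf{Q})^* = \mathbf{Q}^*\mathbf{P}^*$, for any two dual quaternions
$\mathbf{P}$, $\mathbf{Q}$. This property is useful for combining two or more successive
transformations (see Eq. 5.51). The norm of a dual quaternion can now be defined as
follows:

$$
\|\mathbf{Q}\| = \sqrt{\mathbf{Q}\mathbf{Q}^*} = \sqrt{\left(|Q|^2,\ 2\,Q \cdot \hat{Q}\right)}
= \left(|Q|,\ \frac{Q \cdot \hat{Q}}{|Q|}\right) \tag{5.89}
$$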

The above derivation is based on the definition of the square root of a dual
number as given in Eq. 5.81. A unit dual quaternion $\mathbf{Q}$ satisfies the condition
$\|\mathbf{Q}\| = 1$. From Eq. 5.89, we see that $\mathbf{Q} = (Q, \hat{Q})$ is a unit
dual quaternion if and only if $|Q| = 1$ (i.e., $Q$ is a unit quaternion) and
$Q \cdot \hat{Q} = 0$ (i.e., $Q$ is orthogonal to $\hat{Q}$ in quaternion space, or
$\hat{Q} = 0$).


Conjugate type 2: If we treat $\mathbf{Q}$ as a dual number $(Q, \hat{Q})$, then the
application of the rule in Eq. 5.79 gives the following definition:

$$
\mathbf{Q}^* = (Q, -\hat{Q}) = (q_0, q_1, q_2, q_3,\ -\hat{q}_0, -\hat{q}_1, -\hat{q}_2, -\hat{q}_3) \tag{5.90}
$$

The main drawback of the above definition is that it does not lead to a convenient
definition for the unit norm. Further, it does not satisfy the condition
$(\mathbf{P}\mathbf{Q})^* = \mathbf{Q}^*\mathbf{P}^*$.

Conjugate type 3: Here we combine both the above definitions to form a new type
of conjugate as given below:

$$
\mathbf{Q}^* = (Q^*, -\hat{Q}^*) = (q_0, -q_1, -q_2, -q_3,\ -\hat{q}_0, \hat{q}_1, \hat{q}_2, \hat{q}_3) \tag{5.91}
$$

The above definition satisfies the properties
$(\mathbf{P}\mathbf{Q})^* = \mathbf{Q}^*\mathbf{P}^*$, and
$(\mathbf{Q}^*)^* = \mathbf{Q}$. The norm in this case is defined as

$$
\|\mathbf{Q}\| = \sqrt{\mathbf{Q}\mathbf{Q}^*}
= \sqrt{\left(|Q|^2,\ \hat{Q}Q^* - (\hat{Q}Q^*)^*\right)}
= \left(|Q|,\ \frac{\hat{Q}Q^* - (\hat{Q}Q^*)^*}{2|Q|}\right) \tag{5.92}
$$

With the above norm, a unit dual quaternion $\mathbf{Q}$ must have a unit quaternion $Q$
for its real part, and $\hat{Q}Q^*$ must be a real quaternion. In the next section, we
will use the above definition (type 3) of the conjugate to construct dual quaternions
that represent rigid body transformations.


5.9.3 Transformations Using Dual Quaternions 


Recall that any unit quaternion $Q$ can be used to perform a rotational transformation
of a vector $p = (x, y, z)$ in three-dimensional space using the quaternion product
$QPQ^*$ where $P$ is the quaternion $(0, p)$. We can also represent the vector $p$ by
the dual quaternion $\mathbf{P} = (1, P) = (1, 0, 0, 0, 0, x, y, z)$. $\mathbf{P}$ is a
unit dual quaternion. Similarly, if $Q$ is a unit quaternion, then
$\mathbf{Q} = (Q, 0)$ is a unit dual quaternion. Then

$$
\begin{aligned}
\mathbf{Q}\mathbf{P}\mathbf{Q}^* &= (Q, 0)\,(1, P)\,(Q^*, 0) = (Q,\ QP)\,Q^* = (1,\ QPQ^*) \\
&= (1, P') = \mathbf{P}'
\end{aligned}
\tag{5.93}
$$

where P’ is the quaternion (0, p’) that represents the transformed (rotated) vector. 
The above result is valid for all types of dual quaternion conjugates described in the 
previous section. It shows that for every unit quaternion there exists a corresponding 
unit dual quaternion that performs exactly the same rotational transformation of 
vectors. We now ask the question: does such a transformation exist for translations 
in three-dimensional space? 

Given a translation vector $t = (t_1, t_2, t_3)$, let us construct a quaternion $T$ in
the form $(0, t/2)$, and from it, a dual quaternion $\mathbf{T}$ as

$$
\mathbf{T} = (1, T) = \left(1, 0, 0, 0, 0, \frac{t_1}{2}, \frac{t_2}{2}, \frac{t_3}{2}\right) \tag{5.94}
$$

Note the division of the vector components by 2 in $T$, similar to that of a rotation
angle in a unit quaternion (see Eq. 5.44). Using conjugate type 3,

$$
\mathbf{T}^* = (1, -T^*) = (1, T) = \mathbf{T} \tag{5.95}
$$

Applying a transformation of $\mathbf{P}$ using $\mathbf{T}$ similar to Eq. 5.93,

$$
\mathbf{T}\mathbf{P}\mathbf{T}^* = (1, T)\,(1, P)\,(1, T) = (1,\ P + 2T) = \mathbf{P}' \tag{5.96}
$$


The above equation shows that a point $p = (x, y, z)$ gets transformed into the
point $p' = (x + t_1, y + t_2, z + t_3)$ if $p$ is embedded in a quaternion as
$P = (0, p)$, and the quaternion itself is embedded in a dual quaternion $\mathbf{P}$ as
$(1, P)$. Thus we can use $\mathbf{T}$ as a dual quaternion representing spatial
translations. We will now use the above results to construct a dual quaternion that
represents the most general rigid body transformation: a rotation by an angle $\delta$
about an arbitrary vector $(l, m, n)$ through the origin, followed by a displacement by
a translation vector $(t_1, t_2, t_3)$. Let $\mathbf{Q} = (Q, 0)$ and
$\mathbf{T} = (1, T)$ represent the rotation and the translation respectively. The
composite transformation is then represented by the dual quaternion
$\mathbf{G} = (Q, TQ)$ as seen in the following derivation:


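$$
\begin{aligned}
\mathbf{G}\mathbf{P}\mathbf{G}^* &= (Q,\ TQ)\,(1, P)\,(Q,\ TQ)^* = (Q,\ QP + TQ)\,(Q^*,\ -Q^*T^*) \\
&= (QQ^*,\ QPQ^* + TQQ^* - QQ^*T^*) = (1,\ QPQ^* + 2T) = \mathbf{P}'
\end{aligned}
\tag{5.97}
$$

The quaternion $QPQ^* + 2T$ gives the transformed point after the required
rotation and translation.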
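The composite transformation can be verified with a short numerical sketch. Here a
dual quaternion is stored as a pair of quaternion tuples (real part, dual part); this
representation and the helper names (`dq_mul`, `dq_conj3`, `rigid_transform`) are
assumptions of the sketch and not the companion class interface.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (q0, q1, q2, q3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(q):
    """Quaternion conjugate Q*."""
    return (q[0], -q[1], -q[2], -q[3])

def dq_mul(A, B):
    """Dual quaternion product: (P, P^)(Q, Q^) = (PQ, PQ^ + P^Q)."""
    cross = tuple(x + y for x, y in zip(qmul(A[0], B[1]), qmul(A[1], B[0])))
    return (qmul(A[0], B[0]), cross)

def dq_conj3(A):
    """Conjugate type 3: (Q, Q^)* = (Q*, -Q^*)."""
    return (qconj(A[0]), tuple(-c for c in qconj(A[1])))

def rigid_transform(q, t, p):
    """Rotate point p by unit quaternion q, then translate by t, using G = (Q, TQ)."""
    T = (0.0, t[0] / 2.0, t[1] / 2.0, t[2] / 2.0)
    G = (q, qmul(T, q))
    P = ((1.0, 0.0, 0.0, 0.0), (0.0, *p))
    result = dq_mul(dq_mul(G, P), dq_conj3(G))   # G P G*
    return result[1][1:]                         # vector part of the dual part

r = math.sqrt(0.5)
q = (r, 0.0, 0.0, r)                             # 90 degrees about Z
moved = rigid_transform(q, (1.0, 2.0, 3.0), (1.0, 0.0, 0.0))   # about (1, 3, 3)
```

The point (1, 0, 0) is first rotated to (0, 1, 0) and then displaced by (1, 2, 3),
which gives (1, 3, 3).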


5.10 Summary 


This chapter gave an overview of the quaternion algebra including the properties 
that are useful for graphics applications. Unit quaternions represent rotations about 
the origin. Composite rotations can be represented by a product of quaternions. 
The multiplicative inverse of a unit quaternion is the same as its conjugate. A unit 
quaternion with all of its components negated represents the same orientation as the 
original quaternion. 

Computer graphics animations generally involve several rotation interpolations. 
This chapter compared the effects produced by Euler angle interpolation, axis- 
angle interpolation and quaternion interpolation. The spherical linear interpolation 
of rotations using unit quaternions produced optimal rotation with uniform angular 
velocity. Methods for visualising three-dimensional rotation sequences were dis- 
cussed. 

This chapter also presented the algebra of dual quaternions which has recently 
found applications in graphics. Dual quaternions are defined based on the concept of 
dual numbers, and they can be viewed as 8-dimensional vectors. The conjugate of a 
dual quaternion can be defined in three different ways. The property of dual quaternions
that is important from the point of view of computer graphics is that the most general
rigid-body transformation in three-dimensional space can be represented by unit
dual quaternions.

The next chapter further analyses three-dimensional motion using forward and
inverse kinematics equations. In that chapter, we will revisit the quaternion representation
of rotations to define angular velocity components of motion.


5.11 Supplementary Material for Chap. 5


The folder Chapter5/Code on the companion website contains the definition 
and implementation files for both the quaternion and the dual quaternion classes. It 
also contains the following programs demonstrating the effects of different types of 
interpolation methods on rotational transforms. 


1. Quaternion.cpp 


Additional files: 


Quaternion.h 


class 
it3.h Quaternion 
E ( 


. CPT 


The quaternion class defines methods for performing quaternion operations, 
and representing three-dimensional rotations using quaternions. The class also 
has methods for both linear and spherical linear interpolation of rotations using 
quaternions. The class documentation can be found in Appendix D. 


2. DualQuat.cpp 


Additional files: 


ualQuat 


This class is used for the construction of dual quaternions and for performing 
basic operations and transformations using them. The class documentation can 
be found in Appendix D. 


3. EulerInterp.cpp 


Additional files:


tga.cpp 
xn.tga, xp.tga 


yn.tga, yp.tga 
zn.tga, zp.tga 


The program displays a texture mapped cube with its orientation clearly
shown using the markings of initial direction on each face. For a given set of
initial and final orientations specified using Euler angles, the program generates
the display of ten intermediate orientations using Euler angle interpolation.


4. RotationInterp1.cpp


Additional files: 


The program uses the object model in Fig. 5.8, and two orientations as given 
in Table 5.1 to compare the paths taken by Euler, quaternion and angle-axis 
interpolations. Pressing key ‘1’ selects Euler, ‘2’ angle-axis, and ‘3’ quaternion 
interpolation. Pressing space bar shows the motion of the object through the 
interpolated sequence. 


5. RotationInterp2.cpp 


Additional files 


Quaternion.h 


Quaternion.cpp 


The program displays an interpolation sequence using triangles placed on a 
sphere. Different parts of the sphere can be viewed by rotating it using the arrow 
keys. The initial, final and the interpolated values are also displayed in text form. 
Pressing key 1 selects Euler interpolation, key 2 selects angle-axis interpolation,
and key 3 selects quaternion interpolation.


5.12 Bibliographical Notes 


The algebra of quaternions was first discovered by the Irish mathematician Sir
William Rowan Hamilton (1805-1865). Most of his work on the quaternion group
was later published as a book (Hamilton and Joly 1899). A detailed description
of the quaternion algebra including definitions, properties and proofs of theorems
is given in Kuipers (1999) and Hanson (2006). An in-depth theoretical analysis of
the quaternion group, associative algebras and higher dimensional structures can be
found in Conway and Smith (2003), Ward (1997) and Kamberov (2002).

Shoemake's paper (Shoemake 1985) established the effectiveness of quaternions
as a powerful mathematical tool in graphics applications. Several books on computer
graphics such as Eberly (2007), Foley (1996), Watt and Policarpo (2003) describe
the applications of quaternions in rotational transformations of objects.

One of the early publications containing references to dual numbers and dual
quaternions highlighting their importance in kinematics is Bottema and Roth (1979).
However, it is a more recent publication by Ladislav Kavan et al. (2007) that showed
that dual quaternions could indeed be used in computer graphics, particularly in the
area of vertex skinning, for representing rotations combined with displacements.


Chapter 6
Kinematics 


Overview 


The term “kinematics” refers to the study of the translational and rotational motion 
of objects without reference to mass, force or torque. Kinematics equations are 
used to describe three-dimensional motion of a multi-body system in terms of 
translational and rotational motions, and optionally, linear and angular velocities. 
Kinematics analysis becomes important in the animation of articulated models and 
skeletal structures containing serial chains of joints and links. 

To set the context for developing the kinematics equations for graphics applica- 
tions, we first give an outline of robot manipulators comprising a chain of joints. 
Both forward and inverse kinematics equations of joint chains are then discussed in 
detail. Iterative numerical algorithms for computing joint angles for a given target 
position are also presented. These methods are useful for performing goal-directed 
motion in an animation sequence. 


6.1 Robot Manipulators 


In a system containing several interconnected links, it is often required to find the 
global position of the end-point of the last link. This end-point is called the end 
effector. In an animated character model that performs a certain task, this could 
be the tip of a finger. In a robot manipulator, knowing the end effector position 
is important to carry out tasks such as inspection, picking, welding, painting, etc. 
Robot manipulators usually contain many links and different types of joints. In such 
systems, the motion of the end effector becomes exceedingly complex, as it depends 
on many joint parameters. 

Fig. 6.1 A graphics model of a PUMA robot

Fig. 6.2 Commonly used joints in robot manipulator arms (revolute joint, prismatic joint, Hooke's joint, spherical joint)

The Programmable Universal Machine for Assembly (PUMA) is a classic
example of a robot manipulator arm. A graphics model of the PUMA robot is
shown in Fig. 6.1. It consists of a chain of links and joints, with the end effector
or the gripping device forming the last link. The other end of the joint chain is
fixed to the base. This link forms the root of the tree that represents the hierarchy of 
transformations applied to the links. This hierarchical structure is the same as that of 
the scene graph we saw in Chap. 3. The transformations depend on the rotation and 
displacement of each link relative to its parent. The joint types as well as physical 
mounting constraints dictate the degrees of freedom of a particular configuration. 
The range of allowable angular and linear displacements at a joint also depends on 
the joint type. 

Several types of joints can be found in robot manipulators. The most common 
is the revolute joint that is used for a simple rotation of a link about a fixed axis, 
providing one degree of freedom. A prismatic joint, on the other hand, allows a
translation or displacement of a link with respect to its parent. Compound rotations
about two orthogonal axes can be performed using a Cardan joint or a Hooke's
joint. A Hooke's joint can be modelled by two revolute joints whose axes intersect.
A more sophisticated type of joint providing three axes rotation is the spherical joint, 
also known as the ball and socket joint. Sample illustrations of these joints are given 
in Fig. 6.2. 

For graphics applications, joint chains with only rotational transformations
are commonly used. Some examples of such systems were given earlier in
Chaps. 3 and 4. Generalised rotations with multiple degrees of freedom can be
easily modelled using either Euler angles or quaternions as described in the previous
chapter. In the next section, we consider the problem of finding the global position
of the end effector, given the joint angle parameters.


6.2 Forward Kinematics


The term forward kinematics refers to the movement of a joint chain, given all the 
information about the relative position and orientation of each link with respect to 
its parent, and absolute position of the root joint. Forward kinematics equations are 
used to determine the position of the end effector in the world coordinate system for 
a given set of joint angles. 


6.2.1 Joint Chain in Two Dimensions 


Consider a 3-link chain shown in Fig. 6.3, that is constrained to move on a
two-dimensional xy-plane. Assume that the absolute position of the base link is specified
by the point $A = (x_a, y_a)$, and that the link lengths $d_1, d_2, d_3$, and the joint angles
$\theta_1, \theta_2, \theta_3$ are given. These parameters completely specify the configuration of the joint
chain. Note that the joint angles are defined relative to the parent link. Using this
information, we seek the coordinates of the end effector E.

For a planar motion, the angles are simply summed up from the base link to the
current link to find the absolute orientation of that link. In the example given above,
angles $\theta_1, \theta_2$ are positive while $\theta_3$ is negative. The coordinates of the points B, C,
E can be computed in a sequence starting from the base as follows:

$$
\begin{aligned}
x_b &= x_a + d_1\cos\theta_1 \\
y_b &= y_a + d_1\sin\theta_1 \\
x_c &= x_b + d_2\cos(\theta_1 + \theta_2) \\
y_c &= y_b + d_2\sin(\theta_1 + \theta_2) \\
x_e &= x_c + d_3\cos(\theta_1 + \theta_2 + \theta_3) \\
    &= d_3\cos(\theta_1 + \theta_2 + \theta_3) + d_2\cos(\theta_1 + \theta_2) + d_1\cos\theta_1 + x_a \\
y_e &= y_c + d_3\sin(\theta_1 + \theta_2 + \theta_3) \\
    &= d_3\sin(\theta_1 + \theta_2 + \theta_3) + d_2\sin(\theta_1 + \theta_2) + d_1\sin\theta_1 + y_a
\end{aligned} \tag{6.1}
$$

Fig. 6.3 A planar motion of a three-link joint chain

Fig. 6.4 A 4-link joint chain in three-dimensional space


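The above sequence can be extended to any number of links and joint angles:

$$
\begin{aligned}
x_n &= x_{n-1} + d_n\cos\left(\sum_{i=1}^{n}\theta_i\right) \\
y_n &= y_{n-1} + d_n\sin\left(\sum_{i=1}^{n}\theta_i\right)
\end{aligned} \tag{6.2}
$$

where $(x_n, y_n)$ is the position of the end point of the nth link, and $d_n$ its length. If n is the index
of the last link containing the end effector, then its direction is given by the unit
vector $\left(\cos\left(\sum_{i=1}^{n}\theta_i\right), \sin\left(\sum_{i=1}^{n}\theta_i\right)\right)$.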
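
The recurrence in Eq. 6.2 translates directly into a loop that accumulates the joint angles. The following is a minimal sketch; the function name `planar_fk` and its argument layout are illustrative assumptions, not part of the companion code.

```python
import math

def planar_fk(base, lengths, angles):
    """Return the joint positions of a planar chain, ending with the end effector.

    base    -- (x_a, y_a), position of the base joint
    lengths -- link lengths d_1..d_n
    angles  -- joint angles theta_1..theta_n in radians, relative to the parent link
    """
    x, y = base
    points = [(x, y)]
    total = 0.0                       # running sum of joint angles
    for d, theta in zip(lengths, angles):
        total += theta                # absolute orientation of the current link
        x += d * math.cos(total)
        y += d * math.sin(total)
        points.append((x, y))
    return points

# Three unit links: the first points up, the next two each turn by -90 degrees.
points = planar_fk((0.0, 0.0), [1.0, 1.0, 1.0], [math.pi / 2, -math.pi / 2, -math.pi / 2])
```

For this configuration the points B, C and E should come out close to (0, 1), (1, 1) and (1, 0) respectively.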


6.2.2 Joint Chain in 3D Space 


In a three-dimensional coordinate system, we should be able to apply the most 
general rotational transformation to every link of the joint chain. We can then 
simulate the movement of links connected by a revolute joint, a Hooke's joint, or a 
spherical joint. In order to define the relative orientation of a link with respect to its 
parent, we will need to define an orthogonal right-handed body-fixed frame on each 
link. 

Consider a 4-link joint chain shown in Fig. 6.4. A link i has a body-fixed frame
$(u_i, v_i, w_i)$ and a length $d_i$. Every link is assumed to be aligned along the x-direction
in its frame, given by the $u_i$ axis. The rotation of the link i is defined by the relative
orientation of the frame $(u_i, v_i, w_i)$ with respect to its parent's frame $(u_{i-1}, v_{i-1}, w_{i-1})$.
This is specified by a 3 × 3 rotation matrix $R_i$. The rotation matrix can
be formed using any representation of generalized rotations such as Euler angles,
angle-axis parameters, or quaternions. The end effector is denoted by the point E.
The position of link i is indicated by the point $P_i$. The forward kinematics solution
for this joint chain attempts to find the coordinates $(x_e, y_e, z_e)$ of the point E in the
world coordinate system, given the position of the base link $P_1 = (x_1, y_1, z_1)$, lengths
of the links $d_1, \ldots, d_4$, and rotation matrices $R_1, \ldots, R_4$.

Note that the matrix $R_1$ represents the rotational transformation of the first link's
local frame $(u_1, v_1, w_1)$ in the world coordinate space. Therefore,

$$u_1 = R_1\begin{pmatrix}1\\0\\0\end{pmatrix}, \quad v_1 = R_1\begin{pmatrix}0\\1\\0\end{pmatrix}, \quad w_1 = R_1\begin{pmatrix}0\\0\\1\end{pmatrix}. \tag{6.3}$$

The position of the point $P_2$ can be determined as

$$\begin{pmatrix}x_2\\y_2\\z_2\end{pmatrix} = R_1\begin{pmatrix}d_1\\0\\0\end{pmatrix} + \begin{pmatrix}x_1\\y_1\\z_1\end{pmatrix} \tag{6.4}$$

The matrix $R_2$ gives the rotation of the frame $(u_2, v_2, w_2)$ with respect to the
frame $(u_1, v_1, w_1)$. The position of the point $P_3$ can be obtained in terms of the
coordinates of $P_2$ as follows:

$$\begin{pmatrix}x_3\\y_3\\z_3\end{pmatrix} = R_1R_2\begin{pmatrix}d_2\\0\\0\end{pmatrix} + \begin{pmatrix}x_2\\y_2\\z_2\end{pmatrix} \tag{6.5}$$

Continuing as above, the coordinates of the end effector E are obtained as shown
below.

$$\begin{pmatrix}x_e\\y_e\\z_e\end{pmatrix} = R_1R_2R_3R_4\begin{pmatrix}d_4\\0\\0\end{pmatrix} + \begin{pmatrix}x_4\\y_4\\z_4\end{pmatrix} \tag{6.6}$$

The above equation can be expanded and expressed in terms of the known
parameters:

$$\begin{pmatrix}x_e\\y_e\\z_e\end{pmatrix} = R_1R_2R_3R_4\begin{pmatrix}d_4\\0\\0\end{pmatrix} + R_1R_2R_3\begin{pmatrix}d_3\\0\\0\end{pmatrix} + R_1R_2\begin{pmatrix}d_2\\0\\0\end{pmatrix} + R_1\begin{pmatrix}d_1\\0\\0\end{pmatrix} + \begin{pmatrix}x_1\\y_1\\z_1\end{pmatrix} \tag{6.7}$$

Fig. 6.5 A scene graph based representation of the transformations applied to the links of the joint chain in Fig. 6.4


The orientation of the frame of the end effector in the world coordinate system
is given by the product matrix $R_1R_2R_3R_4$. The sequence of derivations given above
can be extended to form an iterative algorithm for computing the end effector
position of a general n-link joint chain.

We can use a scene graph to represent the transformations of the joint chain as
shown in Fig. 6.5. Using this scene graph model, the coordinates of the end effector
in the root node's reference frame are given by

$$\begin{pmatrix}x_e\\y_e\\z_e\end{pmatrix} = T_1R_1T_2R_2T_3R_3T_4R_4\begin{pmatrix}d_4\\0\\0\end{pmatrix} \tag{6.8}$$

where $T_i$ denotes a translation by the vector

$$T_i = \begin{pmatrix}d_{i-1}\\0\\0\end{pmatrix}, \; i = 2, 3, 4, \quad \text{and} \quad T_1 = \begin{pmatrix}x_1\\y_1\\z_1\end{pmatrix}. \tag{6.9}$$

The equivalence of Eqs. 6.7 and 6.8 can be readily established. 
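
The iterative form of Eqs. 6.4 to 6.6 can be sketched as a loop that accumulates the product of rotation matrices while stepping along each link. This is a minimal illustration using plain nested lists for 3 × 3 matrices; the helper names (`matmul`, `matvec`, `rot_y`, `rot_z`, `chain_fk`) are assumptions made for the example.

```python
import math

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    """Product of a 3 x 3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def chain_fk(p1, lengths, rotations):
    """Return the end effector position and orientation of a serial chain.

    p1        -- position of the base link P_1
    lengths   -- link lengths d_1..d_n (each link lies along its local x-axis)
    rotations -- matrices R_1..R_n, each relative to the parent frame
    """
    orientation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    position = list(p1)
    for d, r in zip(lengths, rotations):
        orientation = matmul(orientation, r)          # R_1 R_2 ... R_i
        step = matvec(orientation, [d, 0.0, 0.0])     # link vector in world space
        position = [p + s for p, s in zip(position, step)]
    return position, orientation

# Two unit links: the first is rotated 90 degrees about z, the second 90 degrees about y.
E, R = chain_fk((0.0, 0.0, 0.0), [1.0, 1.0], [rot_z(math.pi / 2), rot_y(math.pi / 2)])
```

In this example the first link ends near (0, 1, 0) and the end effector near (0, 1, -1). The returned matrix is the accumulated product of the rotations, which gives the orientation of the end effector frame.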


6.3 Linear and Angular Velocity 


In addition to the position and the orientation, the velocity of the end effector is also
an important parameter in many applications involving a serial chain. For example,
an articulated character model may be required to move an object with constant
velocity. The velocity of the end effector is a combination of the linear velocity of
the chain itself and the angular velocity introduced by the joint rotations.

Fig. 6.6 Velocity vectors on a single link in two-dimensional space


6.3.1 Velocity in Two Dimensions 


First, we consider a single link AB that can move on the xy-plane, and rotate about
the point A (Fig. 6.6). The position of the link at any instant is defined by the
coordinates $(x_a, y_a)$ of the point A. The point B takes the role of the end effector.
The orientation of the link is measured by the angle θ made by the link with the
direction of the x-axis. The linear velocity of the link is the instantaneous speed
with which it is moved from its current position A. If (Δx, Δy) denote the change
in the position of the link from A in an infinitesimal interval of time Δt, the linear
velocity components are given by

$$\mathbf{v}_a = \lim_{\Delta t \to 0}\left(\frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t}\right) \tag{6.10}$$

The angular velocity ω of the link is defined as the instantaneous change in the
rotation angle θ:

$$\omega = \dot{\theta} = \lim_{\Delta t \to 0}\frac{\Delta\theta}{\Delta t} \tag{6.11}$$

The direction of angular velocity is perpendicular to the xy-plane. If k is a unit
vector along the z-direction, the angular velocity vector is

$$\boldsymbol{\omega} = \omega\,\mathbf{k} \tag{6.12}$$

The linear velocity $\mathbf{v}_b$ of the point B induced by the above rotation is tangential to
the circular arc with radius r at B (Fig. 6.6). This velocity is relative to the point A.
If r denotes the vector from A to B given by $(x_b - x_a, y_b - y_a)$, then $\mathbf{v}_b$ is defined as
the following vector cross product:

$$\mathbf{v}_b = \boldsymbol{\omega} \times \mathbf{r} = (y_a - y_b,\; x_b - x_a)\,\dot{\theta} \tag{6.13}$$

The total velocity of the end effector B relative to the coordinate frame is simply
the vector sum

$$\mathbf{v} = \mathbf{v}_a + \mathbf{v}_b \tag{6.14}$$


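Now consider a three-link joint chain on the xy-plane, shown in Fig. 6.3. We
define the vectors $\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$ along the links as follows:

$$
\begin{aligned}
\mathbf{r}_1 &= (d_1\cos\theta_1,\; d_1\sin\theta_1) \\
\mathbf{r}_2 &= (d_2\cos(\theta_1 + \theta_2),\; d_2\sin(\theta_1 + \theta_2)) \\
\mathbf{r}_3 &= (d_3\cos(\theta_1 + \theta_2 + \theta_3),\; d_3\sin(\theta_1 + \theta_2 + \theta_3))
\end{aligned} \tag{6.15}
$$

The linear velocity $\mathbf{v}_e$ of the end effector E induced by the three joint angle
rotations is given by

$$\mathbf{v}_e = (\boldsymbol{\omega}_1 \times (\mathbf{r}_1 + \mathbf{r}_2 + \mathbf{r}_3)) + (\boldsymbol{\omega}_2 \times (\mathbf{r}_2 + \mathbf{r}_3)) + (\boldsymbol{\omega}_3 \times \mathbf{r}_3) \tag{6.16}$$

where $\boldsymbol{\omega}_1 = \dot{\theta}_1\mathbf{k}$, $\boldsymbol{\omega}_2 = \dot{\theta}_2\mathbf{k}$, $\boldsymbol{\omega}_3 = \dot{\theta}_3\mathbf{k}$, and k is a unit vector along the z-axis.
Therefore

$$
\begin{aligned}
\mathbf{v}_e ={}& \big(-d_1\sin\theta_1 - d_2\sin(\theta_1 + \theta_2) - d_3\sin(\theta_1 + \theta_2 + \theta_3), \\
& \;\; d_1\cos\theta_1 + d_2\cos(\theta_1 + \theta_2) + d_3\cos(\theta_1 + \theta_2 + \theta_3)\big)\,\dot{\theta}_1 \\
&+ \big(-d_2\sin(\theta_1 + \theta_2) - d_3\sin(\theta_1 + \theta_2 + \theta_3),\; d_2\cos(\theta_1 + \theta_2) + d_3\cos(\theta_1 + \theta_2 + \theta_3)\big)\,\dot{\theta}_2 \\
&+ \big(-d_3\sin(\theta_1 + \theta_2 + \theta_3),\; d_3\cos(\theta_1 + \theta_2 + \theta_3)\big)\,\dot{\theta}_3
\end{aligned} \tag{6.17}
$$

The total velocity of the end effector E is $\mathbf{v}_e + \mathbf{v}_a$ where $\mathbf{v}_a$ is the velocity of the
chain induced by the translational movement of the base A. As a particular case of
Eq. 6.13, if p is a vector from A to B that undergoes only a rotational motion about
A, then the linear velocity of the point B is given by

$$\dot{\mathbf{p}} = \boldsymbol{\omega} \times \mathbf{p} \tag{6.18}$$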
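
The cross-product form of the end effector velocity can be sketched for a planar chain with any number of links. The code below is an illustrative sketch with hypothetical function names; it assumes the base is fixed at the origin, so only the velocity induced by the joint rotations is computed. The `end_effector` helper is included so that the result can be compared against a finite-difference estimate.

```python
import math

def link_vectors(lengths, angles):
    """Vectors r_i along the links; joint angles are relative to the parent link."""
    total, vectors = 0.0, []
    for d, theta in zip(lengths, angles):
        total += theta
        vectors.append((d * math.cos(total), d * math.sin(total)))
    return vectors

def end_effector(lengths, angles):
    """End effector position for a base fixed at the origin."""
    vectors = link_vectors(lengths, angles)
    return (sum(r[0] for r in vectors), sum(r[1] for r in vectors))

def end_effector_velocity(lengths, angles, rates):
    """Sum over the joints of omega_i x (r_i + ... + r_n), with omega_i = rate_i k."""
    vectors = link_vectors(lengths, angles)
    vx = vy = 0.0
    for i, rate in enumerate(rates):
        sx = sum(r[0] for r in vectors[i:])   # vector from joint i to the end effector
        sy = sum(r[1] for r in vectors[i:])
        vx -= rate * sy                       # (rate k) x (sx, sy, 0) = rate (-sy, sx, 0)
        vy += rate * sx
    return vx, vy

velocity = end_effector_velocity([2.0, 1.5, 1.0], [0.3, 0.5, -0.4], [0.2, -0.1, 0.7])
```

Perturbing the joint angles by a small multiple of the joint rates and differencing `end_effector` gives a simple numerical check of the returned velocity.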


6.3.2 Velocity Under Euler Angle Transformations 


The animation of a general serial chain in a three-dimensional space can be
performed using Euler angle rotations (see Eq. 5.30) applied at the joints. In an
extrinsic composition of rotations, the axes of rotation are fixed relative to the joint
chain. In such a case, if the Euler angle sequence is given by {ψ, φ, θ} as described
in Sect. 5.4.1, the angular rate vector has the following form:

$$\boldsymbol{\omega} = \begin{pmatrix}\dot{\psi}\\ \dot{\phi}\\ \dot{\theta}\end{pmatrix} \tag{6.19}$$

Fig. 6.7 Angular velocity vectors on a joint chain in three-dimensional space

Let us now consider a joint chain that is transformed using Euler angle rotations,
as shown in Fig. 6.7.

Each joint $P_i$ (i = 1, 2, 3) has a set of Euler angles $\{\psi_i, \phi_i, \theta_i\}$ from which we
can construct a rotational transformation matrix $R_i$ using Eq. 5.30, and an angular
velocity vector $\boldsymbol{\omega}_i$ using Eq. 6.19. If $d_i$ is the length of the ith link, the vectors $\mathbf{r}_i$ along
the link directions can be computed as

$$
\begin{aligned}
\mathbf{r}_1 &= R_1\begin{pmatrix}d_1\\0\\0\end{pmatrix} \\
\mathbf{r}_2 &= R_1R_2\begin{pmatrix}d_2\\0\\0\end{pmatrix} \\
\mathbf{r}_3 &= R_1R_2R_3\begin{pmatrix}d_3\\0\\0\end{pmatrix}
\end{aligned} \tag{6.20}
$$

The linear velocity $\mathbf{v}_e$ of the end effector E resulting from the changes in the Euler
angles can now be computed using Eq. 6.16. We add this velocity to the translational
velocity of the joint chain at the base $P_1$ to get the total velocity of the end effector
E with respect to the reference frame.


6.3.3 Quaternion Velocity 


We know that if P = (0, p) is a pure quaternion, and Q a unit quaternion, then
the equation $P' = QPQ^*$ gives a rotational transformation of the vector p, where
P' = (0, p'). The quaternion transformation can be viewed as defining the orientation
of an object where p is a vector specified in a body-fixed frame, and p' the same
vector in the fixed (inertial) coordinate reference frame. Differentiating both sides
and noting that p is a constant vector,

$$\dot{P}' = \dot{Q}PQ^* + QP\dot{Q}^* \tag{6.21}$$

The inverse transformation for P is given by $P = Q^*P'Q$. Substituting this
expression in the above equation, we get

$$\dot{P}' = \dot{Q}Q^*P'QQ^* + QQ^*P'Q\dot{Q}^* \tag{6.22}$$

Since Q is a unit quaternion, $QQ^* = 1$. Therefore,

$$\dot{P}' = \dot{Q}Q^*P' + P'Q\dot{Q}^* \tag{6.23}$$

Differentiating both sides of the equation $QQ^* = 1$, we also find that

$$\dot{Q}Q^* + Q\dot{Q}^* = 0 \tag{6.24}$$

The above equation shows that $\dot{Q}Q^* + (\dot{Q}Q^*)^* = 0$. In other words, the real
part of the quaternion $\dot{Q}Q^*$ is zero. Hence $\dot{Q}Q^*$ can be expressed in the form (0, v).
With these observations, Eq. 6.23 becomes

$$(0, \dot{\mathbf{p}}') = (0, \mathbf{v})(0, \mathbf{p}') - (0, \mathbf{p}')(0, \mathbf{v}) \tag{6.25}$$

Using the quaternion multiplication rule in Eq. 5.11, we get

$$\dot{\mathbf{p}}' = 2(\mathbf{v} \times \mathbf{p}') \tag{6.26}$$

Since we consider only rotational motion of the vector p', its linear velocity is
given by Eq. 6.18. Comparing both equations, we find that ω = 2v. Hence we can
write

$$\dot{Q}Q^* = \left(0, \frac{\boldsymbol{\omega}}{2}\right) \tag{6.27}$$

where ω is the angular rate. Conversely, if a vector is rotated using a unit quaternion
Q, the angular rate is given by the vector part of the quaternion product $2\dot{Q}Q^*$.
Using Eq. 5.13, we can write this relationship in matrix form as given below.

$$\begin{pmatrix}0\\ \omega_1\\ \omega_2\\ \omega_3\end{pmatrix} = 2\begin{pmatrix}q_0 & q_1 & q_2 & q_3\\ -q_1 & q_0 & -q_3 & q_2\\ -q_2 & q_3 & q_0 & -q_1\\ -q_3 & -q_2 & q_1 & q_0\end{pmatrix}\begin{pmatrix}\dot{q}_0\\ \dot{q}_1\\ \dot{q}_2\\ \dot{q}_3\end{pmatrix} \tag{6.28}$$

Note that Q is a quaternion of the type given in Eq. 5.44. Accordingly, $\dot{Q}$ takes
the form

$$\dot{Q} = \left(-\sin\frac{\theta}{2},\; l\cos\frac{\theta}{2},\; m\cos\frac{\theta}{2},\; n\cos\frac{\theta}{2}\right)\frac{\dot{\theta}}{2} \tag{6.29}$$

As expected, the above equations yield the result

$$\boldsymbol{\omega} = (l, m, n)\,\dot{\theta} \tag{6.30}$$
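
The relationship between the quaternion derivative and the angular rate can be checked numerically. The sketch below assumes a rotation about a fixed unit axis (l, m, n) with a known angular speed, approximates the quaternion derivative by central differences, and recovers the angular rate as the vector part of twice the product of the derivative and the conjugate. All names are illustrative.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def axis_angle_quat(axis, theta):
    """Unit quaternion for a rotation by theta about the unit vector axis."""
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), axis[0]*s, axis[1]*s, axis[2]*s)

def angular_velocity(q, q_dot):
    """Vector part of 2 * q_dot * conj(q)."""
    w = qmul(q_dot, qconj(q))
    return tuple(2 * c for c in w[1:])

axis = (1 / 3, 2 / 3, 2 / 3)           # unit vector (l, m, n)
theta, theta_dot, h = 0.8, 1.5, 1e-6   # angle, angular speed, time step

q = axis_angle_quat(axis, theta)
q_plus = axis_angle_quat(axis, theta + h * theta_dot)
q_minus = axis_angle_quat(axis, theta - h * theta_dot)
q_dot = tuple((a - b) / (2 * h) for a, b in zip(q_plus, q_minus))

omega = angular_velocity(q, q_dot)
```

The computed `omega` should be close to the axis scaled by the angular speed, in agreement with Eq. 6.30.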


6.3.4 The Jacobian 


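In general, we can assume that the end effector position $E = (x_e, y_e, z_e)$ of an n-link
joint chain can be expressed as a function of joint angles $\theta_i$ for i = 1, ..., n (for
example, see Eq. 6.1). Thus we can write

$$
\begin{aligned}
x_e &= x_e(\theta_1, \theta_2, \ldots, \theta_n) \\
y_e &= y_e(\theta_1, \theta_2, \ldots, \theta_n) \\
z_e &= z_e(\theta_1, \theta_2, \ldots, \theta_n)
\end{aligned} \tag{6.31}
$$

If $\Delta\theta_i$ denotes an infinitesimal change in the joint angle $\theta_i$ for i = 1, ..., n, and
$(\Delta x_e, \Delta y_e, \Delta z_e)$ the corresponding change in the end effector position during a
small time interval Δt, we have,

$$
\begin{aligned}
x_e + \Delta x_e &= x_e(\theta_1 + \Delta\theta_1, \theta_2 + \Delta\theta_2, \ldots, \theta_n + \Delta\theta_n) \\
y_e + \Delta y_e &= y_e(\theta_1 + \Delta\theta_1, \theta_2 + \Delta\theta_2, \ldots, \theta_n + \Delta\theta_n) \\
z_e + \Delta z_e &= z_e(\theta_1 + \Delta\theta_1, \theta_2 + \Delta\theta_2, \ldots, \theta_n + \Delta\theta_n)
\end{aligned} \tag{6.32}
$$

Assuming that joint angle perturbations are small, we can use Taylor's first order
approximation to express the above set of equations in matrix form as follows:

$$\begin{pmatrix}\Delta x_e\\ \Delta y_e\\ \Delta z_e\end{pmatrix} = \begin{pmatrix}\dfrac{\partial x_e}{\partial\theta_1} & \dfrac{\partial x_e}{\partial\theta_2} & \cdots & \dfrac{\partial x_e}{\partial\theta_n}\\[2ex] \dfrac{\partial y_e}{\partial\theta_1} & \dfrac{\partial y_e}{\partial\theta_2} & \cdots & \dfrac{\partial y_e}{\partial\theta_n}\\[2ex] \dfrac{\partial z_e}{\partial\theta_1} & \dfrac{\partial z_e}{\partial\theta_2} & \cdots & \dfrac{\partial z_e}{\partial\theta_n}\end{pmatrix}\begin{pmatrix}\Delta\theta_1\\ \Delta\theta_2\\ \vdots\\ \Delta\theta_n\end{pmatrix} \tag{6.33}$$

From the above equation, it follows that

$$\mathbf{v}_e = \dot{E} = J\dot{\boldsymbol{\theta}} \tag{6.34}$$

where

$$\dot{E} = \begin{pmatrix}\dot{x}_e\\ \dot{y}_e\\ \dot{z}_e\end{pmatrix}, \quad \dot{\boldsymbol{\theta}} = \begin{pmatrix}\dot{\theta}_1\\ \dot{\theta}_2\\ \vdots\\ \dot{\theta}_n\end{pmatrix}. \tag{6.35}$$

Fig. 6.8 The Jacobian matrix can be constructed using the axis of rotation of each link and a vector
from that link to the end effector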

The 3 × n matrix J is called the Jacobian of the transformation in Eq. 6.31.

As an example, consider the end effector position of a 3-link chain given in
Eq. 6.1. The Jacobian in this case is a 2 × 3 matrix containing the partial derivatives
of $x_e$ and $y_e$ with respect to the three joint angles. It can be easily verified that the
expressions for the velocity components obtained using Eq. 6.34 are the same as
those given in Eq. 6.17.

The ith column of the Jacobian in Eq. 6.33 can also be obtained using the axis of
rotation of the ith link and the vector from that link to the end effector. Figure 6.8
shows an example, where the second link's general rotational transformation has an
equivalent axis of rotation given by the unit vector u. The vector from the second
link to the end effector is $E - P_2$, denoted by $\mathbf{s}_2$.

The second column of the 3 × 4 Jacobian matrix for the above example can be
computed using the vector cross product $\mathbf{u} \times \mathbf{s}_2$. Note that $\mathbf{s}_2 = \mathbf{r}_2 + \mathbf{r}_3 + \mathbf{r}_4$.
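
For a planar chain every joint rotates about the z-axis, so the construction described above reduces to a cross product of the unit vector k with the vector from each joint to the end effector. The following sketch builds the 2 × n Jacobian in this way; the function names are illustrative, and the base is assumed to be at the origin unless specified.

```python
import math

def joint_positions(lengths, angles, base=(0.0, 0.0)):
    """Positions of the joints, with the end effector as the last entry."""
    x, y = base
    total, points = 0.0, [(x, y)]
    for d, theta in zip(lengths, angles):
        total += theta
        x += d * math.cos(total)
        y += d * math.sin(total)
        points.append((x, y))
    return points

def planar_jacobian(lengths, angles):
    """2 x n Jacobian whose ith column is k x s_i, where s_i = E - P_i."""
    points = joint_positions(lengths, angles)
    ex, ey = points[-1]
    row_x, row_y = [], []
    for px, py in points[:-1]:
        sx, sy = ex - px, ey - py
        row_x.append(-sy)     # x-component of k x (sx, sy, 0)
        row_y.append(sx)      # y-component of k x (sx, sy, 0)
    return [row_x, row_y]

J = planar_jacobian([1.0, 0.8, 0.5], [0.4, -0.7, 1.1])
```

Each column can be compared with a finite-difference estimate of the partial derivatives of the end effector position with respect to the corresponding joint angle.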


6.4 Inverse Kinematics 


Inverse kinematics (IK) deals with the process of computing the joint angles, given
the world coordinates of the end effector. Inverse kinematics solutions are needed
for animating an articulated figure using only the desired positions of end points as
inputs. Two examples are shown in Fig. 6.9, where the known end effector position
is indicated by the point E.

Fig. 6.9 Inverse kinematics solutions try to find the joint angles of a serial chain given
the position of the end effector E

Fig. 6.10 (a) Multiple solutions may exist for the inverse kinematics problem for a 2-link chain.
(b) A solution exists only when the target point is between the inner and outer circles. (c) A simple
geometric construction used for an inverse kinematics solution

In the absence of joint angle constraints, multiple solutions may exist for a two-link
joint chain as shown in Fig. 6.10a. On the other hand, a solution may not exist
for certain other positions of the end effector. In Fig. 6.10b, a solution cannot be
found if the target position is either inside the inner circle of radius $d_1 - d_2$, or
outside the outer circle of radius $d_1 + d_2$.

Without loss of generality, we can assume that the base of the joint chain A is 
fixed at the origin. We can also assume that there are no joint angle constraints. In 
the following sections, we will discuss methods for arriving at inverse kinematics 
solutions with these assumptions. 


6.4.1 2-Link Inverse Kinematics 


We can easily develop an analytical solution for the 2-link inverse kinematics
problem for the configuration shown in Fig. 6.10c. If the coordinates $(x_e, y_e)$ of
the end point E are given, the joint angles $\theta_1, \theta_2$ can be determined as follows:
Let AE = k. Therefore, $k^2 = x_e^2 + y_e^2$. From triangle ABE we get,

$$k^2 = d_1^2 + d_2^2 - 2d_1d_2\cos(\pi - \theta_2) \tag{6.36}$$

Hence,

$$\theta_2 = \cos^{-1}\left(\frac{x_e^2 + y_e^2 - d_1^2 - d_2^2}{2d_1d_2}\right) \tag{6.37}$$

Also,

$$\tan(\phi + \theta_1) = \frac{y_e}{x_e} \tag{6.38}$$

From triangle AEE',

$$\tan\phi = \frac{d_2\sin\theta_2}{d_1 + d_2\cos\theta_2} \tag{6.39}$$

From the previous two equations, we get

$$\theta_1 = \tan^{-1}\left(\frac{y_e}{x_e}\right) - \tan^{-1}\left(\frac{d_2\sin\theta_2}{d_1 + d_2\cos\theta_2}\right) \tag{6.40}$$

Equation 6.37 is valid only if

$$(d_1 - d_2)^2 \le x_e^2 + y_e^2 \le (d_1 + d_2)^2 \tag{6.41}$$

The above condition corresponds to the situation shown in Fig. 6.10b.
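
The analytical solution above can be sketched as a short function. One deliberate substitution is made here: the inverse tangents of Eq. 6.40 are evaluated with `math.atan2`, which handles targets in all four quadrants and a zero x-coordinate. The function names are illustrative, and the base is assumed to be at the origin.

```python
import math

def two_link_ik(xe, ye, d1, d2):
    """Joint angles (theta1, theta2) placing the end of a 2-link chain at (xe, ye).

    Returns None when the target violates the reachability condition of Eq. 6.41.
    Only the solution with theta2 in [0, pi] is returned.
    """
    k2 = xe * xe + ye * ye
    if not (d1 - d2) ** 2 <= k2 <= (d1 + d2) ** 2:
        return None
    c2 = (k2 - d1 * d1 - d2 * d2) / (2 * d1 * d2)
    c2 = max(-1.0, min(1.0, c2))          # guard against rounding just outside [-1, 1]
    theta2 = math.acos(c2)
    phi = math.atan2(d2 * math.sin(theta2), d1 + d2 * math.cos(theta2))
    theta1 = math.atan2(ye, xe) - phi
    return theta1, theta2

def two_link_fk(theta1, theta2, d1, d2):
    """Forward kinematics used to verify the solution."""
    return (d1 * math.cos(theta1) + d2 * math.cos(theta1 + theta2),
            d1 * math.sin(theta1) + d2 * math.sin(theta1 + theta2))

solution = two_link_ik(1.2, 0.7, 1.0, 0.8)
unreachable = two_link_ik(3.0, 0.0, 1.0, 0.8)
```

Feeding `solution` back into `two_link_fk` should reproduce the target (1.2, 0.7), while `unreachable` is `None` because the target lies outside the outer circle. The second solution of Fig. 6.10a corresponds to negating `theta2` and recomputing `theta1`.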


6.4.2 n-Link Inverse Kinematics 


For a general n-link configuration, the problem of estimating the joint angles
$\theta_1, \theta_2, \ldots, \theta_n$, given only the end effector coordinates $(x_e, y_e, z_e)$, clearly leads
to an under-determined system of equations when n > 3. Such a system is called
a redundant manipulator, implying that more than one set of joint angles could
possibly lead to the same end effector position. A non-redundant manipulator in
three-dimensional space contains only three links.

Suppose we are required to move the end effector from its current position E to a
desired target location given by $T = (x_t, y_t, z_t)$. The inverse kinematics problem can
be rephrased as follows: Determine the change in joint angles required to produce a
change in the end effector position from E to T. If we denote this displacement of
the end effector by the vector ΔE = T − E, and the joint angle perturbation vector
by Δθ, then from Eq. 6.33 we know that

$$\Delta E = J\,\Delta\boldsymbol{\theta} \tag{6.42}$$

where J is the 3 × n Jacobian matrix. J can be inverted only for a non-redundant
manipulator (n = 3). Generally when n > 3, J is not a square invertible matrix, and
therefore we cannot directly obtain Δθ from the above equation. However,
pre-multiplying both sides by the transpose $J^T$, we can form a symmetric, square
matrix $(J^TJ)$, and, provided this matrix is invertible, obtain a solution for Δθ as

$$\Delta\boldsymbol{\theta} = J^+\,\Delta E \tag{6.43}$$

where

$$J^+ = (J^TJ)^{-1}J^T \tag{6.44}$$

The above matrix is called the left pseudo-inverse of J. For an n-link chain,
$(J^TJ)$ is an n × n matrix of rank at most 3, so it is singular when n > 3. In that case
the pseudo-inverse of J can be computed using Singular Value Decomposition (SVD).
If J has a decomposition of the form $USV^T$, then
the pseudo-inverse of J is given by

$$J^+ = VS^+U^T \tag{6.45}$$

In the above matrix equation, U is a 3 × 3 orthogonal matrix, S is a 3 × n
diagonal matrix, and V is an n × n orthogonal matrix. Columns of U are orthonormal
eigenvectors of $JJ^T$, and the columns of V are orthonormal eigenvectors of $J^TJ$. The
matrix S contains square-roots of the non-zero eigenvalues of either $JJ^T$ or $J^TJ$. Its pseudo-inverse $S^+$
can be readily obtained by transposing S and taking the reciprocals of the non-zero diagonal
elements. Denoting the columns of U by vectors $u_i$ (i = 1..3), and the columns of V
by vectors $v_i$ (i = 1..n), we have

$$J = \begin{pmatrix}u_1 & u_2 & u_3\end{pmatrix}\begin{pmatrix}\sqrt{\sigma_1} & 0 & 0 & \cdots & 0\\ 0 & \sqrt{\sigma_2} & 0 & \cdots & 0\\ 0 & 0 & \sqrt{\sigma_3} & \cdots & 0\end{pmatrix}\begin{pmatrix}v_1^T\\ v_2^T\\ \vdots\\ v_n^T\end{pmatrix} \tag{6.46}$$

and

$$J^+ = \begin{pmatrix}v_1 & v_2 & \cdots & v_n\end{pmatrix}\begin{pmatrix}\frac{1}{\sqrt{\sigma_1}} & 0 & 0\\ 0 & \frac{1}{\sqrt{\sigma_2}} & 0\\ 0 & 0 & \frac{1}{\sqrt{\sigma_3}}\\ 0 & 0 & 0\\ \vdots & \vdots & \vdots\\ 0 & 0 & 0\end{pmatrix}\begin{pmatrix}u_1^T\\ u_2^T\\ u_3^T\end{pmatrix} \tag{6.47}$$

where $\sigma_i$ denotes the ith eigenvalue of the square matrix $JJ^T$, and $\sigma_1 \ge \sigma_2 \ge \sigma_3$.
Substituting the above expression in Eq. 6.43 and simplifying,

$$\begin{pmatrix}\Delta\theta_1\\ \Delta\theta_2\\ \vdots\\ \Delta\theta_n\end{pmatrix} = \begin{pmatrix}\frac{1}{\sqrt{\sigma_1}}v_1 & \frac{1}{\sqrt{\sigma_2}}v_2 & \frac{1}{\sqrt{\sigma_3}}v_3\end{pmatrix}\begin{pmatrix}u_1^T\\ u_2^T\\ u_3^T\end{pmatrix}\begin{pmatrix}x_t - x_e\\ y_t - y_e\\ z_t - z_e\end{pmatrix} \tag{6.48}$$

Note that the sizes of the three matrices on the right-hand side of the above
equation are n × 3, 3 × 3, and 3 × 1 respectively. In the following sections, we
discuss iterative numerical methods that try to move the end effector through a
sequence of points to the desired target position.


6.5 Gradient Descent 


The inverse kinematics solution for computing Δθ as outlined in the previous
section is based on an important assumption that both ΔE (the distance from
the current end effector position to the target) and Δθ (joint angle perturbations)
are small. Many practical situations violate these conditions. The two-dimensional
analogue of the situation where the distance between the end effector and target is
large is shown in Fig. 6.11a. The y-axis represents the end effector position whose
dependency on the joint angle θ is given by the function y = f(θ). The desired target
position is indicated by the ordinate T.

Fig. 6.11 (a) Computing Δθ from the derivative alone can lead to significant errors if ΔE is large.
(b) The iterative convergence of the gradient descent algorithm

Listing 6.1 Pseudo code for the gradient descent algorithm

1. Input: λ //A value between 0 and 1
2. Input: T //Target position
3. Input: ε //Error threshold
4. Input: kmax //Maximum number of iterations
5. Input: θ_0 //Initial joint angles
6. k = 0 //Iteration number
7. Compute E(θ_k) //End effector position
8. IF (|T − E| < ε) STOP //Reached target
9. Compute J //Jacobian
10. Compute J+ //Jacobian pseudo-inverse
11. Compute Δθ_k //Using Eq. (6.51)
12. k = k+1 //Increment iteration count
13. Compute θ_k //Update equation in Eq. (6.51)
14. IF (k > kmax) STOP //Maximum iterations exceeded
15. GOTO Step 7
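
The steps of Listing 6.1 can be sketched for a planar chain as follows. This is an illustrative sketch under stated assumptions rather than the book's implementation: the base is fixed at the origin, the Jacobian is 2 × n, and the pseudo-inverse is computed as $J^T(JJ^T)^{-1}$, which coincides with the SVD-based pseudo-inverse whenever J has full row rank. All function names are hypothetical.

```python
import math

def fk(lengths, angles):
    """Joint positions of a planar chain with its base at the origin."""
    x = y = total = 0.0
    points = [(x, y)]
    for d, theta in zip(lengths, angles):
        total += theta
        x += d * math.cos(total)
        y += d * math.sin(total)
        points.append((x, y))
    return points

def jacobian(lengths, angles):
    """2 x n Jacobian; column i is k x (E - P_i)."""
    points = fk(lengths, angles)
    ex, ey = points[-1]
    return [[-(ey - py) for _, py in points[:-1]],
            [ex - px for px, _ in points[:-1]]]

def pseudo_inverse(J):
    """n x 2 matrix J^T (J J^T)^-1; assumes J has full row rank."""
    a = sum(x * x for x in J[0])
    b = sum(x * y for x, y in zip(J[0], J[1]))
    d = sum(y * y for y in J[1])
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("Jacobian is rank deficient at this configuration")
    return [[(jx * d - jy * b) / det, (jy * a - jx * b) / det]
            for jx, jy in zip(J[0], J[1])]

def gradient_descent_ik(lengths, angles, target, lam=0.5, eps=1e-6, kmax=200):
    """Iterate the scaled pseudo-inverse update until the target is reached."""
    angles = list(angles)
    for k in range(kmax):
        ex, ey = fk(lengths, angles)[-1]
        dx, dy = target[0] - ex, target[1] - ey
        if math.hypot(dx, dy) < eps:             # reached target
            return angles, k
        Jp = pseudo_inverse(jacobian(lengths, angles))
        for i, row in enumerate(Jp):             # theta += lam * J+ (T - E)
            angles[i] += lam * (row[0] * dx + row[1] * dy)
    return angles, kmax

angles, iterations = gradient_descent_ik([1.0, 1.0, 1.0], [0.2, 1.0, 1.0], (1.0, 1.5))
```

A smaller scaling factor gives a more cautious path to the target at the cost of more iterations. Near a fully stretched configuration the Jacobian loses rank, which this sketch reports as an error instead of handling.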


The solution given in Eq. 6.43 is equivalent to computing Δθ in the above
example using the formula

$$\Delta\theta = \frac{\Delta E}{\left(\dfrac{df}{d\theta}\right)} \tag{6.49}$$

As can be seen from Fig. 6.11a, there is a large error in the value obtained for
Δθ, the solution giving only a fraction of the required change in θ given by the
distance AB. If ΔE is large, we will need to approach the target in smaller steps.
This is done by scaling ΔE by a factor λ (0 < λ < 1), each time updating the end
effector position and the derivative. This approach is called the gradient descent
method, and is shown in Fig. 6.11b. The following equation computes the value of
incremental changes in θ for each iteration step k, and updates the function value
and its derivative.

$$\Delta\theta_k = \frac{\lambda\,(T - f(\theta_k))}{\left(\dfrac{df}{d\theta}\right)_k}, \qquad \theta_{k+1} = \theta_k + \Delta\theta_k \tag{6.50}$$

We can employ the gradient descent method for iteratively computing Δθ after
introducing the scaling factor λ in Eq. 6.43. The modified equation is given below.

$$\Delta\boldsymbol{\theta}_k = \lambda J_k^+\,(T - E_k), \qquad \boldsymbol{\theta}_{k+1} = \boldsymbol{\theta}_k + \Delta\boldsymbol{\theta}_k \tag{6.51}$$

where $\boldsymbol{\theta}_k$ is a column vector of joint angles updated in the kth iteration. The gradient
descent algorithm for computing the joint angles for an n-link chain is given in
Listing 6.1.


6.6 Cyclic Coordinate Descent


The Cyclic Coordinate Descent (CCD) algorithm is a well-known method used 
for inverse kinematics solutions in computer graphics applications involving joint 
chains and moving targets. CCD performs a series of rotations on the links of a joint 
chain, starting with the last link, each time trying to move the end effector closer to 
the target. 

A sequence of rotations performed by the CCD algorithm for a 4-link chain on
a two-dimensional plane is shown in Fig. 6.12. The joints of the links are denoted
by $P_1$, $P_2$, etc., the target by $T$, and the end effector position by $E$. The last
link is rotated first by an angle $\theta_4$ about $P_4$, where $\theta_4$ is the angle between the end
effector vector $u_4 = E - P_4$ and the target vector $v_4 = T - P_4$ (Fig. 6.12a). This
rotation brings the end effector $E$ to a point on the target vector. The second rotation
is performed about the next link position $P_3$, by an angle $\theta_3$ between the end
effector and target vectors at that point (Fig. 6.12b). This process of rotating links
is continued until the first link position $P_1$ is reached (Fig. 6.12d), and then repeated,
starting again from the last link position $P_4$ (Fig. 6.12e). In three-dimensional space, the axis
of rotation for the ith link at position $P_i$ is calculated as

\[
\omega_i = \frac{u_i \times v_i}{|u_i \times v_i|} \tag{6.52}
\]

where $u_i = E - P_i$, and $v_i = T - P_i$. The angle of rotation about the unit vector $\omega_i$ is

\[
\theta_i = \cos^{-1}\left(\frac{u_i \cdot v_i}{|u_i|\,|v_i|}\right) \tag{6.53}
\]

Fig. 6.12 Sequence of rotations performed by CCD algorithm on a 4-link joint chain

The general algorithm for an n-link joint chain is given in Listing 6.2.

Listing 6.2 Pseudo code for the CCD algorithm
1. Inputs: P1, ..., Pn, T, E
2. Input: ε //Error threshold
3. Input: kmax //Maximum number of iterations
4. i = n //Link index. Start from last link
5. k = 1 //Iteration count
6. Compute u_i, v_i //Vectors E-P_i, T-P_i
7. Compute ω_i //Using Eq. (6.52)
8. Compute θ_i //Using Eq. (6.53)
9. Perform angle-axis rotation (θ_i, ω_i) of link i
10. Compute the new position of E
11. IF (|T-E| < ε) STOP //Reached target
12. i = i-1 //Next link
13. IF (i < 1) THEN {k = k+1; i = n} //Start again
14. IF (k > kmax) STOP //Maximum iterations exceeded
15. GOTO Step 6
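
The steps of Listing 6.2 can be sketched for the planar case as follows. This is a minimal illustration, not the companion program: the function name `ccd_2d`, the sample chain and the target are hypothetical. In two dimensions the rotation axis of Eq. 6.52 is always perpendicular to the plane, so the sketch uses the signed angle between $u_i$ and $v_i$ (obtained with `atan2`) in place of the separate axis and angle of Eqs. 6.52 and 6.53.

```python
import math

def ccd_2d(points, target, eps=1e-4, kmax=100):
    """points = [P1, ..., Pn, E]: joint positions followed by the end effector."""
    pts = [tuple(p) for p in points]
    n = len(pts) - 1                                   # number of links
    for k in range(kmax):
        for i in range(n - 1, -1, -1):                 # start from the last link
            px, py = pts[i]
            ux, uy = pts[n][0] - px, pts[n][1] - py    # u_i = E - P_i
            vx, vy = target[0] - px, target[1] - py    # v_i = T - P_i
            # Signed angle from u_i to v_i (2D stand-in for Eqs. 6.52 and 6.53)
            angle = math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
            c, s = math.cos(angle), math.sin(angle)
            for j in range(i + 1, n + 1):              # rotate everything beyond P_i
                dx, dy = pts[j][0] - px, pts[j][1] - py
                pts[j] = (px + c * dx - s * dy, py + s * dx + c * dy)
            if math.hypot(target[0] - pts[n][0], target[1] - pts[n][1]) < eps:
                return pts, True                       # reached target
    return pts, False                                  # maximum iterations exceeded

# Hypothetical 4-link chain of unit links lying along the x-axis
chain = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
final, reached = ccd_2d(chain, target=(2.0, 2.0))
```

Each rotation is rigid, so link lengths are preserved, and each rotation moves the end effector to the point of its circle about $P_i$ that is nearest to the target, so the distance |T − E| never increases from one step to the next.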

The terminating condition for the iterative algorithm can be defined based on the 
distance TE between the end effector and the target, and also the number of iterations 
performed. Physical systems using a set of joints, such as robotic manipulator arms, 
have joint angle constraints and other physical limitations that should be taken into 
account while designing an inverse kinematics solution. The CCD algorithm can 
generate large angle rotations that may violate joint angle constraints. In some cases, 
particularly when the target is located close to the base, the CCD algorithm causes 
a chain to form a loop, intersecting itself (Fig. 6.13a). Similarly, for certain target 
positions, the algorithm can take a large number of iterations resulting in a slow 
zigzag motion of the end effector (Fig. 6.13b). The method discussed in the next 
section is designed to overcome these drawbacks. 
Fig. 6.13 (a) Two examples showing entangled configurations of a 10-link joint chain generated
by the CCD algorithm. (b) The path showing the convergence of the end effector position towards
a target location


6.7 Circular Alignment Algorithm 


The circular alignment algorithm tries to place the given joint chain along a circular 
arc between the base and the target position, provided the target is reachable. With 
such a placement of the chain, joint angles will automatically assume values in an
acceptable range, and the chain cannot intersect itself. This
method has some key advantages over the CCD algorithm:


1. This algorithm is significantly faster than the CCD algorithm. All joint angles 
have the same value based on a single solution. 

2. The algorithm does not generate large angle rotations. 

3. The algorithm does not generate entangled configurations of chains with large 
number of links. 


The algorithm, however, requires all links to have the same length in order to use 
a simple inverse kinematics solution. The algorithm works on a two-dimensional 
plane containing the base of the link and the target. A general three-dimensional 
problem is thus reduced to two dimensions, assuming that the base link can be 
rotated in such a way that the whole chain is reoriented towards the target with 
all links constrained to move on a single plane. We will first consider the problem 
on the xy-plane, and later discuss how it could be generalized into three dimensions. 

We assume that each link of an n-link chain has length d, and the total length of
the chain is L = nd. The distance of the target T from the base $P_1$ of the joint chain is
denoted by D (Fig. 6.14). If the target is reachable (0 < D < L) then the joints of the
chain can be made to align along a circular path such that the end effector coincides
with the target. There are two possible scenarios as shown in Fig. 6.14.

We first compute the angle $\beta$ subtended by the arc $P_1T$, and then derive the radius,
coordinates of the centre, and joint angle parameters from it. The angle $\beta$ is acute if
the length L of the chain is less than $\pi D/2$, otherwise it is obtuse. In either case, we
have
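\[
\frac{\sin\left(\frac{\beta}{2n}\right)}{\sin\left(\frac{\beta}{2}\right)} = \frac{d}{D} \tag{6.54}
\]

We seek the solution of the above equation for $\beta$, by defining the function

\[
f(\beta) = d\,\sin\left(\frac{\beta}{2}\right) - D\,\sin\left(\frac{\beta}{2n}\right) \tag{6.55}
\]

The function has a derivative

\[
f'(\beta) = \left(\frac{d}{2}\right)\cos\left(\frac{\beta}{2}\right) - \left(\frac{D}{2n}\right)\cos\left(\frac{\beta}{2n}\right) \tag{6.56}
\]

The solution for $\beta$ can be obtained using Newton-Raphson iteration:

\[
\beta_{k+1} = \beta_k - \frac{f(\beta_k)}{f'(\beta_k)} \tag{6.57}
\]

with the initial condition

\[
\beta_0 = 2\pi/n. \tag{6.58}
\]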
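
The iteration of Eq. 6.57 can be sketched as below. The chain parameters (n = 10, d = 1, D = 6) are hypothetical. The sketch also makes one assumption that differs from Eq. 6.58: it starts from $\beta_0 = 2\pi$ (the fully closed chain). The function f also vanishes at the trivial root β = 0, and an iteration started on the rising part of f near zero may be drawn towards that root instead of the required one.

```python
import math

def solve_beta(n, d, D, beta0=2.0 * math.pi, tol=1e-12, kmax=50):
    """Newton-Raphson iteration of Eq. 6.57 applied to f(beta) of Eq. 6.55."""
    beta = beta0
    for _ in range(kmax):
        f = d * math.sin(beta / 2) - D * math.sin(beta / (2 * n))
        df = (d / 2) * math.cos(beta / 2) - (D / (2 * n)) * math.cos(beta / (2 * n))
        step = f / df
        beta -= step                      # beta_{k+1} = beta_k - f / f'
        if abs(step) < tol:               # converged
            break
    return beta

n, d, D = 10, 1.0, 6.0                    # hypothetical chain and target distance
beta = solve_beta(n, d, D)
R = D / (2 * math.sin(beta / 2))          # radius of the circle through base and target
```

A quick consistency check is that a chord of the resulting circle subtending the angle β/n has length d, that is, 2R sin(β/2n) equals the link length.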


The Newton-Raphson method yields fast convergence for the parameter $\beta$, from
which all joint angles can be computed as described below. The radius R of the circle
and the perpendicular distance S (Fig. 6.15) can be obtained as

\[
R = \frac{D}{2\sin\left(\frac{\beta}{2}\right)}, \qquad S = \frac{D}{2\tan\left(\frac{\beta}{2}\right)} \tag{6.59}
\]

Without loss of generality, we can assume that the base of the chain $P_1$ is located
at the origin of the coordinate system. If the target T has coordinates $(x_t, y_t)$, the
centre of the circle is selected among two possible values (Fig. 6.15) as

\[
(x_c, y_c) =
\begin{cases}
\left(\dfrac{x_t}{2} - \dfrac{y_t S}{D},\; \dfrac{y_t}{2} + \dfrac{x_t S}{D}\right) & \text{if } L < \pi D/2 \\[2ex]
\left(\dfrac{x_t}{2} + \dfrac{y_t S}{D},\; \dfrac{y_t}{2} - \dfrac{x_t S}{D}\right) & \text{if } L > \pi D/2
\end{cases} \tag{6.60}
\]


The above choice causes the chain to orient along an anticlockwise circular path 
towards the target, and to have positive values for the joint angles for both the cases 
shown in Fig. 6.14. 

Fig. 6.14 Circular alignment of joints

Fig. 6.15 (a) Two possible orientations of the joint chain for a given target position. (b) Extension
of the inverse kinematics solution to three dimensions

The joint angles for the two-dimensional case are computed as follows. The base
link's joint angle $\theta_1$ is measured with respect to the x-axis, and is given by

\[
\theta_1 = \tan^{-1}\left(\frac{y_t}{x_t}\right) - \frac{(n-1)\,\beta}{2n} \tag{6.61}
\]

All remaining joint angles have the same value given by

\[
\theta_i = \beta/n, \quad i = 2 \ldots n. \tag{6.62}
\]
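
The joint angles can be checked with a small forward kinematics sketch. It assumes the base joint angle is the target direction minus (n − 1)β/(2n), which is what an anticlockwise arc placed symmetrically about the line from base to target requires, and that every other joint angle is β/n, measured relative to the previous link. The values n = 4, d = 1, β = π and a target direction of 30 degrees are hypothetical.

```python
import math

def chain_positions(n, d, beta, alpha):
    """Joint positions for a base at the origin; alpha is the direction of the target."""
    theta = [alpha - (n - 1) * beta / (2 * n)] + [beta / n] * (n - 1)
    x = y = heading = 0.0
    pts = [(x, y)]
    for t in theta:
        heading += t                      # joint angle is relative to the previous link
        x += d * math.cos(heading)
        y += d * math.sin(heading)
        pts.append((x, y))
    return theta, pts

n, d, beta, alpha = 4, 1.0, math.pi, math.radians(30)    # hypothetical values
theta, pts = chain_positions(n, d, beta, alpha)
D = d * math.sin(beta / 2) / math.sin(beta / (2 * n))    # target distance implied by Eq. 6.54
```

The end effector `pts[-1]` then lies at distance D from the base in the direction `alpha`, so the chain reaches a target placed there.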


The approach detailed above can be extended to three dimensions where the
target position is given by $T = (x_t, y_t, z_t)$. The problem is first reduced to two
dimensions by transforming the target location to the xy-plane, and computing the
joint angles as described previously. The transformed target position is

\[
x_t' = \sqrt{x_t^2 + z_t^2}, \qquad y_t' = y_t \tag{6.63}
\]

After computing the joint angles, the whole chain is rotated about the y-axis by an
angle $-\phi$ as shown in Fig. 6.15b to achieve the desired configuration. The rotation
angle $\phi$ can be computed as $\tan^{-1}(z_t/x_t)$. This rotation can be combined with the
joint angle rotation of the base link $P_1$. We can add one more degree of freedom to
the link by allowing the chain to rotate about the line joining the base and the target,
thus varying the direction in which the end effector approaches the target.


6.8 Summary 


This chapter discussed forward and inverse kinematics equations for serial links 
containing only revolute or spherical joints. Such joint chains are commonly used in 
computer graphics for skeletal animation. Forward kinematics equations are used 
to compute the position of the end effector, given the joint angles. The chapter 
presented methods for computing the linear velocity of the end effector as a function 
of angular velocities of the joints. Both Euler angle and quaternion based definitions 
of rotations were considered. In the most general case, when the end effector 
coordinates are expressed as functions of joint angles, the Jacobian matrix defines 
the relationship between the linear and angular velocities. 

Inverse kinematics (IK) solutions can have singularities for redundant manipu- 
lators. The inverse Jacobian in the general IK solution is calculated in terms of the 
pseudo-inverse obtained using methods such as the singular value decomposition. If 
the distance between the end effector position and target is large, iterative numerical 
techniques are often used for a more accurate solution that converges to the target 
position. This chapter also outlined the cyclic coordinate descent and the circular 
alignment algorithms that are useful for animating joint chains. 

The next chapter introduces parametrically generated curves and surfaces and 
discusses their applications in computer graphics. 


6.9 Supplementary Material for Chap. 6 


The folder Chapter6/Code on the companion website contains the following
programs demonstrating the working of inverse kinematics algorithms.


1. IK CCD.cpp 


Additional files: 
none 
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The program shows the working of the cyclic coordinate descent algorithm 
(Sect. 6.6) in transforming a 4-link chain. Target positions can be interactively 
specified using mouse clicks. Pressing the space bar updates the display, showing 
the next step in the sequence of rotations performed on the joint chain. The target 
vector and the end effector vectors are also drawn to show the amount of rotation 
in each step. 


2. IK CAA.cpp 


Additional files: 


The program displays a 10-link joint chain that aligns along a circular path 
to reach a target position. Target positions can be interactively specified using 
mouse clicks. The circular alignment algorithm was discussed in Sect. 6.7. 


6.10 Bibliographical Notes 


Kinematic analysis is an integral part of robotic systems, and most of the important 
references on the topic can be found in the area of serial manipulators and multi- 
body systems. Bottema and Roth (1979), Crane and Duffy (1998), and Jazar (2010) 
are just a few among many excellent books that provide a detailed description of the 
theory of kinematic manipulators, forward kinematics equations, and several types 
of inverse kinematic solutions. Orin and Schrader (1984) and Maciejewski and Klein
(1989) discuss the solutions based on Jacobian inverses.

In the early 1980s, Korein and Badler (1982) proposed inverse kinematics solutions
for goal directed motion of articulated character models. A comprehensive coverage
of kinematics algorithms that are useful in computer animation of character models
can be found in Parent (2002) and Yamane (2010). The cyclic coordinate descent
(CCD) algorithm was introduced by Chris Welman in his Master's thesis (Welman
1989). An overview of this algorithm and its implementation can also be found in
Lander (1998). A fast iterative solver for animating character models was recently
introduced by A. Aristidou (2011). The circular alignment algorithm was also
recently introduced by O. Cardwell (2011).
Chapter 7
Curves and Surfaces 


Overview 


In computer graphics, blending curves and surfaces are widely used for both 
interpolation and approximation. We have previously seen the application of 
Hermite polynomials in vertex blending, and Catmull-Rom splines for keyframe 
interpolation. Spline curves and surfaces also find applications in the interactive 
design of three-dimensional models. 

This chapter gives an overview of polynomial interpolation methods, and the 
construction of splines using different types of piecewise cubic polynomial curves. 
Design aspects such as local control, flexibility and parametric continuity are 
discussed in detail. Surface design techniques using two-dimensional Bezier and 
B-spline surface patches are also presented. Extensions of these methods using 
rational basis functions are then outlined. 


7.1 Polynomial Interpolation 


Suppose we are given n points $(x_i, y_i)$, $i = 1 \ldots n$, on the xy-plane where all $x_i$
are distinct. The polynomial interpolation theorem states that there is a unique
polynomial f(x) of degree at most n − 1 such that

\[
f(x_i) = y_i, \quad i = 1 \ldots n \tag{7.1}
\]

The above equation shows that the polynomial curve given by y = f(x) passes
through all the n points. Such a curve that passes through all input points is called
an interpolating curve. On the other hand, a curve that passes through only a few of
the input points is called an approximating curve. The Bezier spline (see Box 2.4,
Sect. 2.7) is an example of an approximating curve.


R. Mukundan, Advanced Methods in Computer Graphics: With examples in OpenGL, 139 


Fig. 7.1 Polynomial interpolation curves of (a) degree 3, and (b) degree 6 


Consider the polynomial of degree n — 1 given by 


\[
c_1(x) = \frac{(x - x_2)(x - x_3)\cdots(x - x_n)}{(x_1 - x_2)(x_1 - x_3)\cdots(x_1 - x_n)} \tag{7.2}
\]


The above function attains a value 1 if $x = x_1$, and 0 if $x = x_2, \ldots, x_n$. We
can therefore combine such polynomials to form the required interpolating
polynomial f(x):

\[
f(x) = c_1(x)\,y_1 + c_2(x)\,y_2 + \cdots + c_n(x)\,y_n \tag{7.3}
\]


The polynomials $c_i(x)$ are the Lagrange polynomials of degree n − 1 given by

\[
c_i(x) = \prod_{\substack{j=1 \\ j \neq i}}^{n} \frac{x - x_j}{x_i - x_j} \tag{7.4}
\]


As an example, four points (3, 4), (5, 5), (8, 0), (13, 7) are used to construct 
a cubic polynomial curve in Fig. 7.1a. Another interpolating curve through seven 
points is shown in Fig. 7.1b. 
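
Equations 7.3 and 7.4 translate directly into a short routine. The sketch below is illustrative only; the function name `lagrange` is hypothetical, and the data are the four points quoted above for Fig. 7.1a.

```python
def lagrange(points, x):
    """Evaluate the interpolating polynomial of Eq. 7.3 at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        ci = 1.0
        for j, (xj, _) in enumerate(points):
            if j != i:
                ci *= (x - xj) / (xi - xj)    # one factor of c_i(x), Eq. 7.4
        total += ci * yi
    return total

pts = [(3, 4), (5, 5), (8, 0), (13, 7)]       # the four points of Fig. 7.1a
values = [lagrange(pts, x) for x, _ in pts]   # the curve evaluated at each input x
```

Evaluating the polynomial at each $x_i$ returns the corresponding $y_i$, which is the interpolation property of Eq. 7.1.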

Interpolation curves of degree higher than three can potentially have large 
overshoots (marked ‘A’ in Fig. 7.1b), or undesirable oscillations (marked ‘B’ in 
Fig. 7.1b). Such curves, even though they pass through all the user input points, may 
not describe the shape represented by those points. Piecewise polynomial curves of 
a low degree are therefore commonly used for approximating shapes. 

The system of equations in Eq. 7.1 can also be written as a matrix equation
Y = VA:

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_1 & \cdots & x_1^{\,n-1} \\
1 & x_2 & \cdots & x_2^{\,n-1} \\
\vdots & \vdots & & \vdots \\
1 & x_n & \cdots & x_n^{\,n-1}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix} \tag{7.5}
\]

where the polynomial is assumed to have the form

\[
f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1} \tag{7.6}
\]

The coefficients $a_i$ of the polynomial can be obtained by taking the matrix
inverse: $A = V^{-1}Y$. The n × n matrix V is called the Vandermonde matrix. Since
the $x_i$ are all distinct, this matrix is invertible.

We now look at a simple and efficient method for evaluating polynomials of the
form in Eq. 7.6. If we use the formula $x^k = x \cdot x^{k-1}$ to compute the powers of x
for evaluating the terms of the polynomial from left to right, we need to perform
2(n − 1) multiplications and n − 1 additions. Horner's method reduces
the number of multiplications by rearranging the polynomial as a nested
set of expressions:

\[
f(x) = a_0 + x\,(a_1 + x\,(a_2 + \cdots + x\,(a_{n-2} + x\,a_{n-1})\cdots)) \tag{7.7}
\]

Each nested sub-expression in the above equation requires one multiplication and
one addition. Evaluating the polynomial from the innermost expression requires a
total of only n − 1 multiplications.
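
A minimal sketch of the nested evaluation in Eq. 7.7 is given below. The function name `horner` and the sample cubic are hypothetical; the coefficients are passed in the order $a_0, a_1, \ldots, a_{n-1}$.

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + a_{n-1}*x**(n-1) using the nesting of Eq. 7.7."""
    result = coeffs[-1]                   # innermost term a_{n-1}
    for a in reversed(coeffs[:-1]):       # work outwards: a_{n-2}, ..., a_0
        result = result * x + a           # one multiplication and one addition
    return result

value = horner([1.0, -2.0, 0.0, 3.0], 2.0)   # 1 - 2x + 3x^3 evaluated at x = 2
```

For the four coefficients used here the loop body runs three times, matching the count of n − 1 multiplications.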


7.2 Cubic Parametric Curves 


Cubic polynomials have the advantage that they can be easily evaluated and used to 
generate small curve segments of an interpolating spline with sufficient flexibility. 
A cubic polynomial curve can meet four constraints simultaneously such as the 
requirement to pass through four distinct points, or a requirement to pass through 
two points and have user specified tangent directions at those points. Splines 
commonly use parametric representations of piecewise cubic curves defined using 
three polynomials in a single parameter t: 


\[
\begin{aligned}
x(t) &= a_0 + a_1 t + a_2 t^2 + a_3 t^3 \\
y(t) &= b_0 + b_1 t + b_2 t^2 + b_3 t^3 \\
z(t) &= c_0 + c_1 t + c_2 t^2 + c_3 t^3
\end{aligned} \tag{7.8}
\]
The above polynomials are called the x-polynomial, y-polynomial and
z-polynomial respectively. The parameter t usually varies from 0 to 1, with
each value of t corresponding to a single point P(t) = (x(t), y(t), z(t)) on the
curve. The polynomials thus define a mapping from an interval in the
one-dimensional parameter space to a set of points in the three-dimensional space.
A common example is where t represents time, and P(t) the position of a moving
point at that instant. The equation for x(t) given above can be re-written as
follows:

\[
x(t) = \begin{bmatrix} 1 & t & t^2 & t^3 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = TA \tag{7.9}
\]


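The polynomial coefficients $a_i$, $b_i$, $c_i$ are computed using a set of control
points and continuity constraints. As an example, consider the requirement that the
cubic curve needs to pass through four distinct points $P_i = (x_i, y_i, z_i)$, $i = 1 \ldots 4$.
If $t_i$ denotes the values of the parameter t corresponding to the four control points,
we have

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix}
1 & t_1 & t_1^2 & t_1^3 \\
1 & t_2 & t_2^2 & t_2^3 \\
1 & t_3 & t_3^2 & t_3^3 \\
1 & t_4 & t_4^2 & t_4^3
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} \tag{7.10}
\]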

This equation is the cubic version of Eq. 7.5. The 4 × 4 Vandermonde matrix is
invertible if all $t_i$ are distinct. We write this equation in a concise form as $G_x = VA$,
or equivalently as $A = V^{-1}G_x$, where $G_x$ is a column vector containing only the
x-coordinates of the control points. The inverse $V^{-1}$ of the Vandermonde matrix
can be computed as the product UL, where U is the following upper triangular
matrix

\[
U = \begin{bmatrix}
1 & -t_1 & t_1 t_2 & -t_1 t_2 t_3 \\
0 & 1 & -(t_1 + t_2) & t_1 t_2 + t_2 t_3 + t_3 t_1 \\
0 & 0 & 1 & -(t_1 + t_2 + t_3) \\
0 & 0 & 0 & 1
\end{bmatrix} \tag{7.11}
\]
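and L is a lower triangular matrix given by

\[
L = \begin{bmatrix}
1 & 0 & 0 & 0 \\[1ex]
\dfrac{1}{t_1 - t_2} & \dfrac{1}{t_2 - t_1} & 0 & 0 \\[2ex]
\dfrac{1}{(t_1 - t_2)(t_1 - t_3)} & \dfrac{1}{(t_2 - t_1)(t_2 - t_3)} & \dfrac{1}{(t_3 - t_1)(t_3 - t_2)} & 0 \\[2ex]
\dfrac{1}{(t_1 - t_2)(t_1 - t_3)(t_1 - t_4)} & \dfrac{1}{(t_2 - t_1)(t_2 - t_3)(t_2 - t_4)} & \dfrac{1}{(t_3 - t_1)(t_3 - t_2)(t_3 - t_4)} & \dfrac{1}{(t_4 - t_1)(t_4 - t_2)(t_4 - t_3)}
\end{bmatrix} \tag{7.12}
\]

For example, if the parametric values are equally spaced in the interval [0, 1],
so that $t_1 = 0$, $t_2 = 1/3$, $t_3 = 2/3$, $t_4 = 1$, then we have the following values for
$V$ and $V^{-1}$:

\[
V = \begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & \frac{1}{3} & \frac{1}{9} & \frac{1}{27} \\
1 & \frac{2}{3} & \frac{4}{9} & \frac{8}{27} \\
1 & 1 & 1 & 1
\end{bmatrix},
\qquad
V^{-1} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
-\frac{11}{2} & 9 & -\frac{9}{2} & 1 \\
9 & -\frac{45}{2} & 18 & -\frac{9}{2} \\
-\frac{9}{2} & \frac{27}{2} & -\frac{27}{2} & \frac{9}{2}
\end{bmatrix} \tag{7.13}
\]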


From Eq. 7.9, we now have

\[
x(t) = T\,V^{-1} G_x \tag{7.14}
\]

The product $TV^{-1}$ is a row vector containing four functions of the parameter t.
Thus the above equation can be rewritten as

\[
x(t) = \begin{bmatrix} f_1(t), & f_2(t), & f_3(t), & f_4(t) \end{bmatrix} G_x \tag{7.15}
\]




Fig. 7.2 Piecewise cubic interpolation polynomials constructed using groups of four points 


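For the example in Eq. 7.13, we have

\[
\begin{aligned}
f_1(t) &= 1 - \left(\tfrac{11}{2}\right) t + 9t^2 - \left(\tfrac{9}{2}\right) t^3 \\
f_2(t) &= 9t - \left(\tfrac{45}{2}\right) t^2 + \left(\tfrac{27}{2}\right) t^3 \\
f_3(t) &= -\left(\tfrac{9}{2}\right) t + 18t^2 - \left(\tfrac{27}{2}\right) t^3 \\
f_4(t) &= t - \left(\tfrac{9}{2}\right) t^2 + \left(\tfrac{9}{2}\right) t^3
\end{aligned} \tag{7.16}
\]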


The functions $f_i(t)$ are called blending polynomials. Note that the sum of the
above functions is 1 for all values of t. Generalising Eq. 7.15, and since the blending
polynomials are common for x, y, and z axes, we find that

\[
P(t) = \begin{bmatrix} f_1(t), & f_2(t), & f_3(t), & f_4(t) \end{bmatrix}
\begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix} \tag{7.17}
\]

We can thus write the parametric equation for the cubic curve as a combination
of the control points:

\[
P(t) = f_1(t)P_1 + f_2(t)P_2 + f_3(t)P_3 + f_4(t)P_4, \quad 0 \le t \le 1. \tag{7.18}
\]


Figure 7.2 shows a set of points joined together using piecewise cubic polynomial 
curves through groups of four points, constructed using the above equation. Each 
cubic polynomial curve is called a segment. 
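
The blending polynomials for equally spaced parameter values can be sketched as follows. The coefficients are those obtained from the inverse Vandermonde matrix for $t_i$ = 0, 1/3, 2/3, 1; the function names and the four control points are hypothetical.

```python
def blending(t):
    """Blending polynomials for parameter values t_i = 0, 1/3, 2/3, 1."""
    f1 = 1 - 5.5 * t + 9 * t**2 - 4.5 * t**3
    f2 = 9 * t - 22.5 * t**2 + 13.5 * t**3
    f3 = -4.5 * t + 18 * t**2 - 13.5 * t**3
    f4 = t - 4.5 * t**2 + 4.5 * t**3
    return [f1, f2, f3, f4]

def curve_point(ctrl, t):
    """P(t) as a weighted sum of four control points with the weights above."""
    f = blending(t)
    return tuple(sum(fi * p[k] for fi, p in zip(f, ctrl)) for k in range(len(ctrl[0])))

ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0), (4.0, 0.0)]   # hypothetical control points
second = curve_point(ctrl, 1.0 / 3.0)                     # point at the second parameter value
```

The weights sum to 1 for any t, and at t = 1/3 only the second weight is non-zero, so the curve passes through the second control point.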

The matrix $V^{-1}$ is sometimes denoted by M, and referred to as the basis matrix.
With this notation, the blending functions and the basis matrix are related as follows:

\[
\begin{bmatrix} f_1(t), & f_2(t), & f_3(t), & f_4(t) \end{bmatrix} = TM \tag{7.19}
\]
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The points where the polynomial curves meet are called knots. It is often 
desirable to have tangential and higher order continuity at the knots. Such curves 
are called splines. In the next section, we discuss different orders of continuity 
constraints that can be used in the design of interpolating curves and surfaces. 


7.3 Parametric Continuity 


In the previous section we saw an example (Fig. 7.2) of a set of piecewise cubic 
curves joined together to form a single "continuous" curve. Clearly we require 
higher levels of continuity at the points where two curves meet, in order to get a 
smooth transition from one polynomial curve on to another. 

A parametric curve defined using cubic polynomials as in Eq. 7.8 has the property
that the first and second order derivatives exist and are continuous over the interval
in which the curve is defined. Two parametric curves $P_A(t) = (x_A(t), y_A(t), z_A(t))$ and
$P_B(t) = (x_B(t), y_B(t), z_B(t))$ are said to have $C^0$ continuity if they meet at a common
point M (Fig. 7.3). That is, there exist valid parametric values $t_1$, $t_2$ such that

\[
M = (x_A(t_1),\, y_A(t_1),\, z_A(t_1)) = (x_B(t_2),\, y_B(t_2),\, z_B(t_2)) \tag{7.20}
\]


If the tangents to the two curves at M also coincide, then the curves have
$C^1$ continuity. The tangent direction at M is obtained by differentiating the cubic
polynomials with respect to t, and substituting the parametric value for the knot M.
We use the following notation for the derivatives of the x-polynomial in Eq. 7.8:

\[
\begin{aligned}
x_A'(t_1) &= \left(\frac{dx_A(t)}{dt}\right)_{t=t_1} = a_1 + 2a_2 t_1 + 3a_3 t_1^2 \\
x_A''(t_1) &= \left(\frac{d^2x_A(t)}{dt^2}\right)_{t=t_1} = 2a_2 + 6a_3 t_1
\end{aligned} \tag{7.21}
\]


with similar notations for the y-polynomial and the z-polynomial. The vector $(x_A'(t),
y_A'(t), z_A'(t))$ gives the tangent direction on the curve A at point P(t). If t denotes
time, then this vector represents the velocity of the point P as it moves along the
curve A. $C^1$ continuity implies that the velocity of P considered as a point on
the curve A at the knot M is the same as its velocity when considered as a point
on the curve B:

\[
(x_A'(t_1),\, y_A'(t_1),\, z_A'(t_1)) = (x_B'(t_2),\, y_B'(t_2),\, z_B'(t_2)) \tag{7.22}
\]

If two curves are joined with $C^1$ continuity, the point P(t) will have at most finite
acceleration as it crosses the knot M. Second order continuity denoted by $C^2$ requires
that the second derivatives of both curves at M are equal. That is,

\[
(x_A''(t_1),\, y_A''(t_1),\, z_A''(t_1)) = (x_B''(t_2),\, y_B''(t_2),\, z_B''(t_2)) \tag{7.23}
\]

Fig. 7.3 Examples of piecewise cubic curves with different orders of parametric and geometric
continuity


The above vectors represent the curvature at M, or equivalently acceleration
of the point P(t) if t denotes time. The continuity constraints discussed above
are often relaxed to just smoothness constraints that define only the important
shape characteristics used for constructing splines. For example, the requirement in
Eq. 7.22 to have the same tangent vector for both curves at the joint can be relaxed
to the condition that the tangent vectors are just parallel, with possibly unequal
magnitudes. The modified constraint can be written as

\[
(x_A'(t_1),\, y_A'(t_1),\, z_A'(t_1)) = \beta\,(x_B'(t_2),\, y_B'(t_2),\, z_B'(t_2)) \tag{7.24}
\]

for some constant $\beta$. Two curves satisfying the above equation are said to have
a geometric continuity $G^1$ at the common point M. Note that we can always
re-parameterize the curve A by substituting $t = \beta u$ in its equation, and the resultant
tangent vectors at M would still be equal, satisfying the $C^1$ continuity constraint.
The geometric continuity $G^2$ is also similarly defined by introducing a constant
of proportionality in Eq. 7.23. The difference between parametric and geometric
continuity is illustrated through an example in Fig. 7.3.

In column (a) of Fig. 7.3, the curves A and B meet at M with $C^0$ continuity.
The first and the second derivatives of the curves do not meet at the corresponding
point. Column (b) shows the curves with $C^1$ continuity at M where the tangent
vectors are equal. Correspondingly, the first derivatives of the curves meet at a point.
The curves formed using second derivatives are discontinuous. Column (c) shows
the curves with $G^1$ continuity where the tangent vectors at M are only parallel but
unequal in magnitude. The first derivatives of the curves therefore do not meet at
the corresponding point. In column (d), the curves meet with $C^2$ continuity at M.
In this case, the first derivatives meet at a common point with $C^1$ continuity. The
second derivatives of the curves also meet with $C^0$ continuity. Note that the second
derivatives of cubic polynomial curves are always straight lines.


7.4 Hermite Splines 


Hermite splines are cubic polynomial interpolation curves passing through two
control points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$, with the additional requirement
that the curve is tangential to the specified directions at the two end points (Fig. 7.4).

In Fig. 7.4a, the required tangent directions at the end points are denoted by
$m_1$ and $m_2$ with components $(x_1', y_1', z_1')$ and $(x_2', y_2', z_2')$ respectively. For the
interpolating curve, we use the parametric equation given in Eq. 7.8. The control
point $P_1$ corresponds to the parameter value of 0, and $P_2$ corresponds to t = 1. The
tangent vector components at t are given by

\[
\begin{aligned}
x'(t) &= a_1 + 2a_2 t + 3a_3 t^2 \\
y'(t) &= b_1 + 2b_2 t + 3b_3 t^2 \\
z'(t) &= c_1 + 2c_2 t + 3c_3 t^2
\end{aligned} \tag{7.25}
\]

Fig. 7.4 Hermite polynomial interpolation
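Similar to Eq. 7.10, we can now write an equation using position coordinates and
tangent vector components:

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_1' \\ x_2' \end{bmatrix}
=
\begin{bmatrix}
1 & t_1 & t_1^2 & t_1^3 \\
1 & t_2 & t_2^2 & t_2^3 \\
0 & 1 & 2t_1 & 3t_1^2 \\
0 & 1 & 2t_2 & 3t_2^2
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} \tag{7.26}
\]

Substituting the parameter values for the end points in the above equation,
we have

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_1' \\ x_2' \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 \\
0 & 1 & 2 & 3
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} \tag{7.27}
\]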

The basis matrix for Hermite polynomial interpolation is the inverse of the
4 × 4 matrix in the above equation, and is given by

\[
M_H = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-3 & 3 & -2 & -1 \\
2 & -2 & 1 & 1
\end{bmatrix} \tag{7.28}
\]

Pre-multiplying the above matrix by $T = [1, t, t^2, t^3]$, we get the blending
functions $f_i(t)$ (see Eq. 7.19):

\[
\begin{aligned}
f_1(t) &= 1 - 3t^2 + 2t^3 \\
f_2(t) &= 3t^2 - 2t^3 \\
f_3(t) &= t - 2t^2 + t^3 \\
f_4(t) &= -t^2 + t^3
\end{aligned} \tag{7.29}
\]


From the above expressions, we get the parametric equation for the Hermite
polynomial curve:

\[
P(t) = (1 - 3t^2 + 2t^3)\,P_1 + (3t^2 - 2t^3)\,P_2 + (t - 2t^2 + t^3)\,m_1 + (-t^2 + t^3)\,m_2, \quad 0 \le t \le 1. \tag{7.30}
\]

Fig. 7.5 Hermite interpolation spline

The tangent vectors $m_1$ and $m_2$ can have arbitrary magnitude if we require only
$G^1$ continuity at the end points when two curves are joined together. Increasing
the magnitude causes the curve to align closer to the tangent direction. A scale
parameter $\alpha > 0$ for the tangents is introduced into this equation to control the shape
of the cubic curve:

\[
P(t) = (1 - 3t^2 + 2t^3)\,P_1 + (3t^2 - 2t^3)\,P_2 + (t - 2t^2 + t^3)\,\alpha\,m_1 + (-t^2 + t^3)\,\alpha\,m_2 \tag{7.31}
\]

$\alpha$ is sometimes referred to as the tension parameter of the curve. An example
with four different values of $\alpha$ is shown in Fig. 7.4b. Note that when $\alpha = 0$, the
above equation reduces to the straight line segment between $P_1$ and $P_2$.
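
The Hermite curve with a tension parameter can be evaluated directly from its four blending functions. The sketch below is illustrative; the function name, end points and tangent vectors are hypothetical, and `alpha` scales both tangents as described above.

```python
def hermite(p1, p2, m1, m2, t, alpha=1.0):
    """Point on a Hermite curve; alpha scales both tangent vectors."""
    f1 = 1 - 3 * t**2 + 2 * t**3          # weight of the first end point
    f2 = 3 * t**2 - 2 * t**3              # weight of the second end point
    f3 = t - 2 * t**2 + t**3              # weight of the first tangent
    f4 = -t**2 + t**3                     # weight of the second tangent
    return tuple(f1 * a + f2 * b + alpha * (f3 * c + f4 * e)
                 for a, b, c, e in zip(p1, p2, m1, m2))

p1, p2 = (0.0, 0.0), (4.0, 0.0)           # hypothetical end points
m1, m2 = (0.0, 3.0), (0.0, -3.0)          # hypothetical tangent vectors
start = hermite(p1, p2, m1, m2, 0.0)
mid = hermite(p1, p2, m1, m2, 0.5)
end = hermite(p1, p2, m1, m2, 1.0)
```

The curve starts at the first end point and finishes at the second; with `alpha=0` every sampled point stays on the straight segment joining them.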

Given n points (n > 2), we can develop an interpolating spline that passes through 
all the points by constructing Hermite cubic curves for every consecutive pair of 
points. The tangent direction at each knot must be carefully specified by the user in 
such a way that it corresponds to the tangents to curves on both sides of the knot. 

In Fig. 7.5, piecewise Hermite polynomial curves are fitted through a set of 
points. The points are the same as the knots of the interpolation curve shown in 
Fig. 7.2. The common tangent vectors are all defined as parallel to negative y-axis. 


7.5 Cardinal Splines 


A cardinal spline is a smooth piecewise cubic polynomial curve that passes through 
every point except the first and the last in a given set of control points, maintaining 
first-order continuity at every point. A cardinal spline works very much like a 
Hermite spline with the exception that the tangent directions are not specified by 
the user but derived from the control points themselves. 

Consider a set of four control points $P_0$, $P_1$, $P_2$, $P_3$ as shown in Fig. 7.6. The
tangent at $P_1$ is specified in the direction of the vector $P_2 - P_0$, and the tangent at $P_2$
in the direction of the vector $P_3 - P_1$. We can now use Eq. 7.31 with $m_1 = P_2 - P_0$,
and $m_2 = P_3 - P_1$ to generate a Hermite cubic polynomial curve between $P_1$ and
$P_2$. The scaling parameter $\alpha$ controls the tension of the curve. Without any reference
to the tangent directions, the curve's equation can be rewritten as a function of the
control points alone as below.

\[
\begin{aligned}
P(t) = {} & (-t + 2t^2 - t^3)\,\alpha\,P_0 + \left(1 + (\alpha - 3)\,t^2 + (2 - \alpha)\,t^3\right) P_1 \\
& + \left(\alpha t + (3 - 2\alpha)\,t^2 + (\alpha - 2)\,t^3\right) P_2 + (-t^2 + t^3)\,\alpha\,P_3
\end{aligned} \tag{7.32}
\]

Fig. 7.6 A cardinal spline definition using four points

Writing the coefficients of $1, t, t^2, t^3$ of each blending function in the above
equation as columns of a 4 × 4 matrix, we obtain the basis matrix for cardinal
splines:

\[
M_C = \begin{bmatrix}
0 & 1 & 0 & 0 \\
-\alpha & 0 & \alpha & 0 \\
2\alpha & \alpha - 3 & 3 - 2\alpha & -\alpha \\
-\alpha & 2 - \alpha & \alpha - 2 & \alpha
\end{bmatrix} \tag{7.33}
\]

Given a set of n + 2 control points $\{P_0, P_1, \ldots, P_n, P_{n+1}\}$, $n > 1$, we can fit
a cubic curve with the above basis matrix to every pair of consecutive control
points $(P_k, P_{k+1})$, $1 \le k < n$, with tangent vectors defined as $m_k = P_{k+1} - P_{k-1}$,
and $m_{k+1} = P_{k+2} - P_k$. In other words we need to process overlapping blocks of
four control points $[P_{k-1}, P_k, P_{k+1}, P_{k+2}]$, with only the middle two points used
for interpolation at a time.

When $\alpha = 0.5$, we get a special case of cardinal splines called Catmull-Rom
splines. It directly follows from Eq. 7.33 that Catmull-Rom splines are given by the
parametric equation:

\[
P(t) = \begin{bmatrix} 1 & t & t^2 & t^3 \end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-0.5 & 0 & 0.5 & 0 \\
1 & -2.5 & 2 & -0.5 \\
-0.5 & 1.5 & -1.5 & 0.5
\end{bmatrix}
\begin{bmatrix} P_0 \\ P_1 \\ P_2 \\ P_3 \end{bmatrix} \tag{7.34}
\]

Fig. 7.7 A Catmull-Rom spline through a set of control points


Figure 7.7 shows a Catmull-Rom spline generated using a set of control points. 
Compare this figure with the piecewise cubic spline in Fig. 7.2 where the same set 
of control points was used. 
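
The processing of overlapping blocks of four control points can be sketched as follows. The blending weights are those of the cardinal spline with tension `alpha`; the default of 0.5 gives the Catmull-Rom case. The function names, the number of samples per segment and the control points are hypothetical.

```python
def cardinal_point(p0, p1, p2, p3, t, alpha=0.5):
    """Point on the segment between p1 and p2; alpha = 0.5 gives Catmull-Rom."""
    b0 = alpha * (-t + 2 * t**2 - t**3)
    b1 = 1 + (alpha - 3) * t**2 + (2 - alpha) * t**3
    b2 = alpha * t + (3 - 2 * alpha) * t**2 + (alpha - 2) * t**3
    b3 = alpha * (-t**2 + t**3)
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))

def cardinal_spline(ctrl, samples=8, alpha=0.5):
    """Sample every segment (P_k, P_k+1) from overlapping blocks of four points."""
    curve = []
    for k in range(1, len(ctrl) - 2):
        block = ctrl[k - 1:k + 3]                       # [P_k-1, P_k, P_k+1, P_k+2]
        curve += [cardinal_point(*block, s / samples, alpha) for s in range(samples)]
    curve.append(tuple(float(c) for c in ctrl[-2]))     # end the last segment at its knot
    return curve

ctrl = [(0, 0), (1, 2), (3, 3), (5, 1), (6, 0)]         # hypothetical control points
curve = cardinal_spline(ctrl)
```

The sampled curve passes through every control point except the first and the last, which only contribute tangent directions.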


7.6 Bezier Curves 


Bezier splines are approximating curves generated using Bernstein polynomials as
the blending functions (see Box 2.4, Sect. 2.7). Denoting n + 1 control points by
$P_1 \ldots P_{n+1}$, the parametric representation of the nth degree Bezier curve is given by

\[
P(t) = \sum_{i=0}^{n} B_{i,n}(t)\,P_{i+1} \tag{7.35}
\]

where $B_{i,n}(t)$ denotes Bernstein polynomials of degree n. Since Bernstein
polynomials always yield non-negative values for $0 \le t \le 1$, and form a partition of unity,
every point on a Bezier curve is a convex combination of the control points. In
this section, we discuss the construction of piecewise cubic splines using Bezier
curves, and outline an important algorithm that will be later extended to develop the
framework for B-splines.


7.6.1 Cubic Bezier Splines 


The parametric equation of the cubic Bezier curve is given by

\[
\begin{aligned}
P(t) &= (1 - t)^3 P_1 + 3t(1 - t)^2 P_2 + 3t^2(1 - t) P_3 + t^3 P_4, \quad 0 \le t \le 1 \\
&= (1 - 3t + 3t^2 - t^3)\,P_1 + (3t - 6t^2 + 3t^3)\,P_2 + (3t^2 - 3t^3)\,P_3 + t^3 P_4
\end{aligned} \tag{7.36}
\]

Fig. 7.8 Cubic Bezier curves

where $P_1 \ldots P_4$ are the control points. The Bezier spline interpolates between the
first and the last control points. The two middle control points are used to define the
tangent directions at the end points. In the matrix form, the cubic Bezier curve is
given as

\[
P(t) = \begin{bmatrix} 1 & t & t^2 & t^3 \end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
-3 & 3 & 0 & 0 \\
3 & -6 & 3 & 0 \\
-1 & 3 & -3 & 1
\end{bmatrix}
\begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix} \tag{7.37}
\]

Differentiating Eq. 7.36 with respect to t, we get the tangent directions on the
Bezier curve:

\[
P'(t) = (-3 + 6t - 3t^2)\,P_1 + (3 - 12t + 9t^2)\,P_2 + (6t - 9t^2)\,P_3 + 3t^2 P_4 \tag{7.38}
\]

From the above equation, the tangent directions at $P_1$ and $P_4$ are obtained as
follows:

\[
\begin{aligned}
P'(0) &= 3(P_2 - P_1) \\
P'(1) &= 3(P_4 - P_3)
\end{aligned} \tag{7.39}
\]

The control points and the tangent directions are shown in Fig. 7.8a. Clearly,
the Bezier cubic curve is a special case of the Hermite polynomial curve where
m_1 = 3(P_2 - P_1) and m_2 = 3(P_4 - P_3). The following equation relates the input
vector [P_1, P_2, m_1, m_2] of a Hermite curve as given in Eq. 7.30, with the
input vector [P_1, P_2, P_3, P_4] of the Bezier curve, so that the resulting splines
coincide.


$$\begin{bmatrix} P_1 \\ P_2 \\ m_1 \\ m_2 \end{bmatrix}_{\text{Hermite}} =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ -3 & 3 & 0 & 0 \\ 0 & 0 & -3 & 3 \end{bmatrix}
\begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix}_{\text{Bezier}} \qquad (7.40)$$


Fig. 7.9 A Bezier spline passing through a set of control points


Given a set of n control points P_1, ..., P_n, the Bezier spline consisting of piecewise
cubic polynomial curves can be made to pass through the first point and every third
point after it, that is, the points P_{3k+1}, k = 0, 1, 2, .... The remaining points are
used for specifying tangent directions. For G^1 continuity of the spline, we need to
make sure that the three points P_{3k}, P_{3k+1}, P_{3k+2} are collinear for k = 1, 2, ....
An example of a piecewise cubic Bezier spline satisfying this condition is shown in
Fig. 7.9. The knot positions are the same as those used earlier in Figs. 7.2 and 7.5.

Bezier splines are widely used in computer graphics and therefore graphics 
packages commonly support methods for creating Bezier curves of different orders. 
We could also make use of the functionality provided by such libraries for 
generating other types of splines (Hermite, Catmull-Rom etc.), if we can compute 
the Bezier equivalent set of control points for the required spline. As an example, 
by computing the inverse of the 4 x 4 matrix in Eq. 7.40, we can obtain the Bezier 
control points for the required Hermite curve as follows: 


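$$\begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix}_{\text{Bezier}} =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & \tfrac{1}{3} & 0 \\ 0 & 1 & 0 & -\tfrac{1}{3} \\ 0 & 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} P_1 \\ P_2 \\ m_1 \\ m_2 \end{bmatrix}_{\text{Hermite}} \qquad (7.41)$$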
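
The two conversions in Eqs. 7.40 and 7.41 amount to a few vector operations. The sketch below writes them out in Python; the function names are assumptions chosen for illustration, and points and tangents are plain tuples of floats.

```python
def bezier_to_hermite(p1, p2, p3, p4):
    """Eq. 7.40: end points and end tangents of the equivalent Hermite curve."""
    m1 = tuple(3.0 * (b - a) for a, b in zip(p1, p2))   # m1 = 3 (P2 - P1)
    m2 = tuple(3.0 * (b - a) for a, b in zip(p3, p4))   # m2 = 3 (P4 - P3)
    return p1, p4, m1, m2


def hermite_to_bezier(p1, p2, m1, m2):
    """Eq. 7.41: Bezier control points of the Hermite curve (p1, p2, m1, m2)."""
    b2 = tuple(a + m / 3.0 for a, m in zip(p1, m1))     # P1 + m1 / 3
    b3 = tuple(a - m / 3.0 for a, m in zip(p2, m2))     # P2 - m2 / 3
    return p1, b2, b3, p2
```

Applying hermite_to_bezier to the output of bezier_to_hermite should return the original four Bezier control points, up to rounding.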


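In a general case, we express the parametric curve P(t) in terms of the required
spline's basis (denoted by M_S) as well as the Bezier basis as


$$P(t) = T M_S \begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix}_{S}
= T M_{\text{Bezier}} \begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix}_{\text{Bezier}} \qquad (7.42)$$


where T = [1  t  t^2  t^3] is the row vector of powers of t that appears in Eq. 7.37,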
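from which we obtain


$$\begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix}_{\text{Bezier}}
= M_{\text{Bezier}}^{-1} M_S \begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix}_{S} \qquad (7.43)$$


where


$$M_{\text{Bezier}}^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & \tfrac{1}{3} & 0 & 0 \\ 1 & \tfrac{2}{3} & \tfrac{1}{3} & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix} \qquad (7.44)$$


The above matrix is the inverse of the 4 × 4 matrix in Eq. 7.37.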


7.6.2 de-Casteljau’s Algorithm 


De-Casteljau’s algorithm provides an alternative representation of a Bezier
curve in terms of a combination of linear interpolation functions. Given three control
points P_1, P_2, P_3, we can construct parametric equations of two straight lines


$$\begin{aligned}
P_{11}(t) &= (1 - t) P_1 + t P_2 \\
P_{21}(t) &= (1 - t) P_2 + t P_3
\end{aligned} \qquad (7.45)$$


For each parameter value t ∈ [0, 1], the above equations give two points. We now
further interpolate between these two points using the same parameter value:


$$P(t) = (1 - t) P_{11} + t P_{21} \qquad (7.46)$$


The resulting point will lie on the quadratic Bezier curve generated using the
control points P_1, P_2 and P_3. This can be easily proved by substituting for P_{11} and
P_{21} from Eq. 7.45 in the above equation:


$$\begin{aligned}
P(t) &= (1 - t)\{(1 - t) P_1 + t P_2\} + t\{(1 - t) P_2 + t P_3\} \\
&= (1 - t)^2 P_1 + 2t(1 - t) P_2 + t^2 P_3
\end{aligned} \qquad (7.47)$$


Figure 7.10 shows the geometrical interpretation of the above equation. Using
the same method, we can obtain the cubic Bezier curve from four control points
(Fig. 7.8b). Using a parameter value in the range [0, 1], we interpolate between
consecutive pairs of control points to get three points, further interpolate between
them to get two points, and again interpolate between the two points to get a single
point on the cubic curve. This interpolation sequence is shown in Fig. 7.11. The
whole process is repeated for the next parameter value.


Fig. 7.10 Interpolation between three control points using de-Casteljau’s algorithm


Fig. 7.11 Iteration sequence for de-Casteljau’s algorithm with four control points

De-Casteljau’s algorithm for a general Bezier curve of degree n - 1 with control
points P_1 ... P_n can be written as follows:


$$\begin{aligned}
P_{k,d}(t) &= (1 - t) P_{k,d-1}(t) + t P_{k+1,d-1}(t), \quad 0 \le t \le 1, \quad k = 1, \ldots, n-d \\
P_{k,0}(t) &= P_k, \quad k = 1, \ldots, n \\
P(t) &= P_{1,n-1}(t)
\end{aligned} \qquad (7.48)$$


For the above iteration, the index d is varied from 1 to n - 1, and for each d the
index k is varied from 1 to n - d. After each level of iteration (see Fig. 7.11), the
number of points reduces by one. At level n - 1, we get a single point P_{1,n-1} which
lies on the Bezier curve P(t) of degree n - 1.
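
The iteration in Eq. 7.48 can be sketched as a loop that repeatedly replaces a list of points by the interpolated points of the next level. The function name de_casteljau is an assumption made for this illustration.

```python
def de_casteljau(ctrl, t):
    """Point at parameter t on the Bezier curve of degree len(ctrl) - 1 (Eq. 7.48)."""
    pts = [tuple(p) for p in ctrl]          # level 0: P_{k,0} = P_k
    while len(pts) > 1:                     # each level has one point fewer
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])   # consecutive pairs P_{k,d-1}, P_{k+1,d-1}
        ]
    return pts[0]                           # P_{1,n-1}
```

For three control points the result agrees with the quadratic form in Eq. 7.47, and for four control points with the cubic form in Eq. 7.36.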
Fig. 7.12 (a) Effect of varying homogeneous coordinates on Bezier curve. (b) Conic sections
formed using rational Bezier curves


7.6.3 Rational Bezier Curves 


Rational Bezier curves are formed using control points specified in homogeneous
coordinates. A three-dimensional point P = (x, y, z) has an equivalent homogeneous
representation (xh, yh, zh, h), h ≠ 0 (see Box 2.1). The Bezier curve equation in
Eq. 7.35 is applied to each of the components, and correspondingly every point P(t)
also gets a fourth component. The x, y, and z coordinates of P(t) are divided by
its fourth component to get the Cartesian coordinates. The additional parameter h
acts as a weight that can be adjusted to change the shape of the curve. An example
showing the variation of a cubic curve's shape for three equivalent representations of
the control point P_2 is given in Fig. 7.12a. In this 2-D example, the third component
is the homogeneous coordinate h.

The homogeneous coordinate system also allows the representation of points at
infinity, by setting the last component to zero. Defining a control point at infinity
causes the control polygonal line to have disjoint and parallel edges. This feature is
useful for the generation of conic sections using Bezier curves. Figure 7.12b shows
a semi-circular arc and a semi-elliptical arc formed using quadratic Bezier curves.
Among the three control points P_1, P_2, P_3, the point P_2 is at infinity along the +y
direction. The control polygonal line therefore degenerates into two parallel vertical
lines meeting at P_2.
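
The construction described above can be sketched as follows. The function name rational_bezier_point and the particular control points in the usage note are assumptions made for illustration; they are not the values used for Fig. 7.12.

```python
from math import comb


def rational_bezier_point(ctrl_h, t):
    """Evaluate a rational Bezier curve from homogeneous control points.

    Each entry of ctrl_h is (x*h, y*h, ..., h). A point at infinity has
    h = 0 and contributes only a direction.
    """
    n = len(ctrl_h) - 1
    acc = [0.0] * len(ctrl_h[0])
    for i, p in enumerate(ctrl_h):
        w = comb(n, i) * t**i * (1.0 - t) ** (n - i)   # Bernstein weight B_{i,n}(t)
        for c, value in enumerate(p):
            acc[c] += w * value                         # Eq. 7.35 per component
    h = acc[-1]                                         # last component of P(t)
    return tuple(value / h for value in acc[:-1])       # divide to get Cartesian coordinates
```

As a 2-D usage example, the homogeneous control points (-1, 0, 1), (0, 1, 0) and (1, 0, 1), with the middle point at infinity along +y, produce points that satisfy x^2 + y^2 = 1 with y ≥ 0, which is a semi-circular arc of unit radius.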


7.7 Polynomial Interpolants 


The parametric curves introduced in previous sections were all based on piecewise
cubic polynomials and the points on each segment were generated by varying
the parameter t from 0 to 1. In this section, we will develop the framework of a
more general class of interpolating splines where t can have an arbitrary range.
First, we consider interpolating polynomials of degree one, two and three, and then
generalize our results to a polynomial of degree n - 1 passing through n control
points. The ability to specify parameter values at control points provides added
flexibility to the design of splines.

Given two points P_1 = (x_1, y_1, z_1), P_2 = (x_2, y_2, z_2), and two values t_1, t_2 of the
parameter t such that t_1 < t_2, the linear equation of the interpolating line between the
points can be written as


$$P_{11}(t) = \frac{t_2 - t}{t_2 - t_1} P_1 + \frac{t - t_1}{t_2 - t_1} P_2, \quad t_1 \le t \le t_2 \qquad (7.49)$$

We denote the above polynomial as g_1(P_1, P_2; t_1, t_2; t) with the control points and
parameter values included in the function argument. The suffix of g indicates the
degree of the polynomial. The first suffix of P_{11}(t) indicates the starting point on the
spline (P_1), and the second suffix the degree of the polynomial. Using this notation,
the point P_1 itself can be represented as P_{10} or a polynomial g_0(P_1; t_1; t). If we now
add a third point P_3 to the set of control points, with an associated parameter t_3
(t_1 < t_2 < t_3), we can construct a quadratic curve that passes through the three points
as follows. Similar to the previous equation, we first perform a linear interpolation
between P_2 and P_3:


$$g_1(P_2, P_3; t_2, t_3; t) = P_{21}(t) = \frac{t_3 - t}{t_3 - t_2} P_2 + \frac{t - t_2}{t_3 - t_2} P_3, \quad t_2 \le t \le t_3 \qquad (7.50)$$


Then we combine the points P_{11}(t) and P_{21}(t) using a third interpolation formula
with t varying from t_1 to t_3:


$$P_{12}(t) = \frac{t_3 - t}{t_3 - t_1} P_{11} + \frac{t - t_1}{t_3 - t_1} P_{21}, \quad t_1 \le t \le t_3 \qquad (7.51)$$


Substituting the expressions for P_{11} and P_{21} in the above equation, we get a
quadratic polynomial which we denote as g_2(P_1, P_2, P_3; t_1, t_2, t_3; t):


$$\begin{aligned}
g_2(P_1, P_2, P_3; t_1, t_2, t_3; t) &= P_{12}(t), \quad t_1 \le t \le t_3 \\
&= \frac{(t - t_2)(t - t_3)}{(t_1 - t_2)(t_1 - t_3)} P_1
 + \frac{(t - t_1)(t - t_3)}{(t_2 - t_1)(t_2 - t_3)} P_2
 + \frac{(t - t_1)(t - t_2)}{(t_3 - t_1)(t_3 - t_2)} P_3
\end{aligned} \qquad (7.52)$$


Note that the above algorithm is a generalized version of the de-Casteljau's
method outlined in the previous section. For Bezier curves, we used only values
between 0 and 1. In the above equation, however, the parameter is allowed to vary
over the range [t_1, t_3), which is the union of the two intervals [t_1, t_2) and [t_2, t_3) that
were used to generate the line segments. Since the intervals are disjoint, this would
mean that any value of the parameter will always be outside the range of one of the
intervals. This situation is shown in Fig. 7.13. Compare this process with that shown
in Fig. 7.10, where the parameter value is restricted to the range [0, 1] along each
interpolated direction.

Figure 7.14 shows three quadratic splines generated using Eq. 7.52, all with the
same set of control points P_1 = (1, 4), P_2 = (3, 1), P_3 = (6, 2). For the first curve (a),
the parametric values used were t_1 = 2, t_2 = 5, and t_3 = 8. Since the spacing of
values was uniform, the curve also has a nearly uniform tension across the points. In
the second figure (b), the parameters were changed to t_1 = 2, t_2 = 3, and t_3 = 8. The
reduced spacing between t_1 and t_2 is seen as a higher tension of the curve between
P_1 and P_2, closely approximating a straight line. Similarly, in the third figure (c),
we reduced the spacing between t_2 and t_3 by choosing t_1 = 2, t_2 = 6.5, and t_3 = 8.


Fig. 7.14 Quadratic polynomial splines for different parameter values, but with the same control
points

The process outlined above can be extended to a larger set of n control
points P_1 ... P_n and n parameter values t_1 ... t_n (t_1 < t_2 < ... < t_n). We start by
combining every consecutive pair of control points as shown in Eq. 7.49, to form
linear equations P_{11}, P_{21}, ..., P_{n-1,1}. We then combine consecutive pairs of these
polynomials as in Eq. 7.51 to form quadratic polynomials P_{12}, P_{22}, ..., P_{n-2,2}. This
process is iteratively continued till we get the polynomial P_{1,n-1} of degree n - 1.
By evaluating this polynomial by varying t from t_1 to t_n, we get the coordinates
of points along the spline that passes through all the control points. The iterative
procedure for four control points is illustrated in Fig. 7.15.


Fig. 7.15 Computation of a third degree interpolating spline using four control points


Note that P_{k,d}(t) denotes a polynomial of degree d. There are n - d polynomials
of degree d on level d (Fig. 7.15). The polynomial P_{k,d}(t) is formed by combining
two polynomials from the previous level:


$$P_{k,d}(t) = \frac{t_{k+d} - t}{t_{k+d} - t_k} P_{k,d-1}(t) + \frac{t - t_k}{t_{k+d} - t_k} P_{k+1,d-1}(t), \quad t_k \le t \le t_{k+d} \qquad (7.53)$$


For the above iteration, d varies from 1 to n - 1, and for each d, k varies from 1
to n - d. The initial conditions (level d = 0) are set as


$$P_{k,0} = P_k, \quad k = 1, \ldots, n \qquad (7.54)$$


The parametric curve of degree n - 1 generated as above passes through all control
points. Being a polynomial, it is differentiable up to order n - 1, and therefore has
C^{n-1} continuity at all points. However, the curve does not lie within the convex
hull of the control points, as clearly seen from Fig. 7.14. In the next section, we
introduce a popular approximating spline called B-spline, that satisfies the convex
hull property, but does not pass through all control points.
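
The recurrence in Eqs. 7.53 and 7.54 can be sketched in Python as below. The function name interpolate and the 0-based list indexing are assumptions of this sketch; the list index k corresponds to the index k + 1 used in the equations.

```python
def interpolate(ctrl, params, t):
    """Evaluate the interpolating polynomial of degree n - 1 (Eqs. 7.53, 7.54).

    ctrl   : control points P_1 ... P_n as tuples
    params : strictly increasing parameter values t_1 ... t_n
    """
    n = len(ctrl)
    level = [tuple(p) for p in ctrl]                # level 0: P_{k,0} = P_k
    for d in range(1, n):                           # degree of the current level
        nxt = []
        for k in range(n - d):                      # n - d polynomials on level d
            t_k, t_kd = params[k], params[k + d]
            a = (t_kd - t) / (t_kd - t_k)           # weight of P_{k,d-1}
            b = (t - t_k) / (t_kd - t_k)            # weight of P_{k+1,d-1}
            nxt.append(tuple(a * p + b * q for p, q in zip(level[k], level[k + 1])))
        level = nxt
    return level[0]                                 # P_{1,n-1}(t)
```

With the control points of Fig. 7.14, (1, 4), (3, 1), (6, 2), and the parameter values 2, 5, 8, evaluating at t = 2, 5 and 8 should reproduce the three control points, which confirms the interpolation property.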


7.8 B-Splines 


In Fig. 7.14 we observed that interpolating polynomial curves of degree d use a
union of parameter intervals used by the component polynomials of degree d - 1,
causing points to fall outside the convex hull of the control points. Basis splines or
B-splines are commonly used in CAD systems to create approximating splines that
are entirely contained in the convex hull of the control points. In addition, B-splines
of degree d provide C^{d-1} continuity at the knots.


Fig. 7.16 Plot of B_{i,0}(t)


7.8.1 Basis Functions


B-splines are polynomials defined in the parameter space, where a sequence {t_i},
i = 1, ..., m, of non-decreasing values (i.e., t_1 ≤ t_2 ≤ ... ≤ t_m) of a parameter t are
given. The list of parameter values is called a knot vector. B-splines are used as
basis functions to combine a given set of control points to form an approximating
spline. First, we will look at some important characteristics of B-splines. B-splines
of the lowest degree are constant step functions defined using two parameter values
as below.


$$B_{i,0}(t) = \begin{cases} 1, & \text{if } t_i \le t < t_{i+1} \\ 0, & \text{otherwise.} \end{cases} \qquad (7.55)$$


The plot of B_{i,0}(t) for the knot vector {3, 5, 9, 10} is shown in Fig. 7.16.

The second subscript d of the B-spline B_{i,d}(t) denotes the degree of the
polynomial. Basis polynomials of degree 1 and higher are defined using the
following Cox-de Boor recurrence formula:


$$B_{i,d}(t) = \frac{t - t_i}{t_{i+d} - t_i} B_{i,d-1}(t) + \frac{t_{i+d+1} - t}{t_{i+d+1} - t_{i+1}} B_{i+1,d-1}(t), \quad t_i \le t \le t_{i+d+1} \qquad (7.56)$$


To avoid division by zero, the conditions when t_{i+d} = t_i and t_{i+d+1} = t_{i+1} are
considered separately as follows:


$$B_{i,d}(t) = \begin{cases}
\dfrac{t_{i+d+1} - t}{t_{i+d+1} - t_{i+1}} B_{i+1,d-1}(t), & \text{if } t_i = t_{i+d} \\[2ex]
\dfrac{t - t_i}{t_{i+d} - t_i} B_{i,d-1}(t), & \text{if } t_{i+1} = t_{i+d+1}
\end{cases} \qquad (7.57)$$
Fig. 7.17 Plot of B_{i,1}(t)


The above conditions do not arise in uniform B-splines where the knots are all
equally spaced. From Eq. 7.56, we obtain the definition of first degree basis splines
as follows:


$$B_{i,1}(t) = \begin{cases}
\dfrac{t - t_i}{t_{i+1} - t_i}, & \text{if } t_i \le t < t_{i+1} \\[2ex]
\dfrac{t_{i+2} - t}{t_{i+2} - t_{i+1}}, & \text{if } t_{i+1} \le t < t_{i+2} \\[2ex]
0, & \text{otherwise.}
\end{cases} \qquad (7.58)$$


Note that B_{i,1}(t) requires three knot values for each i, and is non-zero only in the
interval [t_i, t_{i+2}). A plot of B-splines of degree one with the knot vector {3, 5, 9, 10}
is shown in Fig. 7.17.

From Eq. 7.56, we get the following equation for second degree B-splines:


$$B_{i,2}(t) = \frac{t - t_i}{t_{i+2} - t_i} B_{i,1}(t) + \frac{t_{i+3} - t}{t_{i+3} - t_{i+1}} B_{i+1,1}(t) \qquad (7.59)$$


Substituting the values from Eq. 7.58 into the above equation, and taking into
account the intervals where B_{i,1}(t) and B_{i+1,1}(t) are non-zero, we get


$$B_{i,2}(t) = \begin{cases}
\dfrac{(t - t_i)^2}{(t_{i+2} - t_i)(t_{i+1} - t_i)}, & \text{if } t_i \le t < t_{i+1} \\[2ex]
\dfrac{(t - t_i)(t_{i+2} - t)}{(t_{i+2} - t_i)(t_{i+2} - t_{i+1})} + \dfrac{(t - t_{i+1})(t_{i+3} - t)}{(t_{i+3} - t_{i+1})(t_{i+2} - t_{i+1})}, & \text{if } t_{i+1} \le t < t_{i+2} \\[2ex]
\dfrac{(t_{i+3} - t)^2}{(t_{i+3} - t_{i+1})(t_{i+3} - t_{i+2})}, & \text{if } t_{i+2} \le t < t_{i+3} \\[2ex]
0, & \text{otherwise.}
\end{cases} \qquad (7.60)$$


The three non-zero sections of B_{1,2}(t) as defined above, are shown in Fig. 7.18.
The knot vector used for generating this figure is again {3, 5, 9, 10}.


Fig. 7.18 Plot of B_{i,2}(t)


Fig. 7.19 Recursive computation of B_{i,d}(t) in terms of B-splines of lower degrees


In general, a B-spline B_{i,d} of degree d is defined using a non-decreasing sequence
of d + 2 knots {t_i, t_{i+1}, ..., t_{i+d+1}} and is non-zero only in the interval [t_i, t_{i+d+1}).
The interval in which a function is non-zero is called its support. The diagram in
Fig. 7.19 shows the recursive computation of B_{i,d}(t), and also the support of every
intermediate polynomial that is evaluated. Comparing this diagram with Fig. 7.15,
we see that the computations performed are very similar to those used by polynomial
interpolants.
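
The recurrence in Eqs. 7.55 to 7.57 translates into a short recursive function. This is a minimal sketch: the name bspline_basis is an assumption, the index i is 1-based as in the text (so knots[0] holds t_1), and a term whose denominator is zero is simply dropped, which also covers the case where both denominators vanish.

```python
def bspline_basis(i, d, t, knots):
    """B_{i,d}(t) by the Cox-de Boor recurrence; i is 1-based, knots[0] is t_1."""
    t_i = knots[i - 1]
    if d == 0:                                   # Eq. 7.55: step function
        return 1.0 if t_i <= t < knots[i] else 0.0
    value = 0.0
    left_den = knots[i + d - 1] - t_i            # t_{i+d} - t_i
    right_den = knots[i + d] - knots[i]          # t_{i+d+1} - t_{i+1}
    if left_den > 0:                             # dropped when t_{i+d} = t_i (Eq. 7.57)
        value += (t - t_i) / left_den * bspline_basis(i, d - 1, t, knots)
    if right_den > 0:                            # dropped when t_{i+d+1} = t_{i+1}
        value += (knots[i + d] - t) / right_den * bspline_basis(i + 1, d - 1, t, knots)
    return value
```

For the knot vector {3, 5, 9, 10} used in Figs. 7.16 to 7.18, bspline_basis(1, 1, 4, [3, 5, 9, 10]) evaluates the rising branch of Eq. 7.58, (4 - 3)/(5 - 3). The plain recursion recomputes lower-degree values several times; it is written for clarity rather than efficiency.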
Fig. 7.20 Effect of movement of a control point on the approximating curve


7.8.2 Approximating Curves 


We shall now look at ways of constructing approximating curves using a set of n
control points P_1 ... P_n, and B-splines as the blending functions. Since the curve is
not required to pass through all control points, we have a selection of polynomials
of different degrees for blending functions. A parametric curve of degree d can be
formed using n B-splines of degree d as follows:


$$P(t) = \sum_{i=1}^{n} P_i B_{i,d}(t), \quad t_{d+1} \le t \le t_{n+1} \qquad (7.61)$$


As seen earlier, the B-spline B_{i,d}(t) requires a knot vector consisting of a
non-decreasing sequence of d + 2 knots {t_i, t_{i+1}, ..., t_{i+d+1}}. Therefore, the summation
in Eq. 7.61 requires n + d + 1 knots {t_1, t_2, ..., t_{n+d+1}}. Note that the parametric
curve is generated by varying t within the closed subinterval [t_{d+1}, t_{n+1}] only, even
though other knot values outside this range may be required for computing the
polynomial values. The end point of the parametric curve t = t_{n+1} is a special point
in the sense that the definition of B_{n,0}(t) is modified to accommodate the point as
follows:


$$B_{n,0}(t_{n+1}) = 1 \qquad (7.62)$$


The values of the knots can be adjusted while maintaining the non-decreasing
order, to make fine local changes to the shape of the resulting curve. Another
advantage of using B-splines as blending functions is that due to their local support,
changes made to a control point will affect the curve only in the neighbourhood of
the point. As an example, consider the situation when the control point P_5 is changed
in Eq. 7.61. Since P_5 is multiplied by B_{5,d} which is zero outside the interval [t_5,
t_{6+d}), any change in the position of P_5 will not affect the curve outside this interval.
This property is depicted in Fig. 7.20, where a second degree approximating curve
is generated using eight control points, and the position of P_5 is shifted vertically
downward by a small distance. The corresponding localized shift in the curve can
be clearly observed in the figure.
Fig. 7.21 A second degree approximating curve through five control points


We shall now look at the geometrical characteristics of shapes and the effects
produced by varying knot positions. Given two points P_1 and P_2, and setting d = 1,
Eq. 7.61 gives the following equation for the interpolating line:


$$\begin{aligned}
P(t) &= \left( \frac{t - t_1}{t_2 - t_1} B_{1,0}(t) + \frac{t_3 - t}{t_3 - t_2} B_{2,0}(t) \right) P_1 \\
&\quad + \left( \frac{t - t_2}{t_3 - t_2} B_{2,0}(t) + \frac{t_4 - t}{t_4 - t_3} B_{3,0}(t) \right) P_2
\end{aligned} \qquad (7.63)$$


Note that the parameter t varies from t_2 to t_3 only (see Eq. 7.61). Therefore,
the first term containing B_{1,0}(t) and the last term containing B_{3,0}(t) vanish from the
above equation, and B_{2,0}(t) = 1. Thus we get the desired equation of the straight line
connecting the two points:


$$P(t) = \frac{t_3 - t}{t_3 - t_2} P_1 + \frac{t - t_2}{t_3 - t_2} P_2 \qquad (7.64)$$


In this case, the knots t_1, t_4 do not affect the shape of the parametric line. We
shall now consider another example with five control points P_1 ... P_5 on a
two-dimensional plane as shown in Fig. 7.21, and an approximating spline generated
using second degree B-splines. Since we require a knot vector containing eight
values, let us choose a uniformly spaced knot vector {10, 20, ..., 80}. The
parameter interval for the curve is [t_3, t_6] (see Eq. 7.61). The points where the
parameter t attains the knot values on the curve are also indicated in the figure.
These points are called knot points. Knot t_2 is required for computing B_{1,2}, and t_7 is
needed for B_{5,2}. The remaining knots t_1 and t_8 do not affect the shape of the curve.

A knot can be repeated multiple times in a knot vector. In the above example,
the curve does not pass through the first and the last control points. However, for a
closer approximation of the control polygonal line, it is often required to have the
curve pass through the end points, and also have the corresponding line segments
tangential to the curve. We saw earlier that Bezier curves satisfy this requirement. If
the first and the last knots have multiplicity d + 1, then the approximating curve of
degree d generated using B-splines also meets this requirement. The knot vector of
such a curve is said to be clamped (Fig. 7.22).


Fig. 7.22 Clamped knot vector


Fig. 7.23 A second degree curve generated using a clamped knot vector. Compare this with the
curve in Fig. 7.21

If the knots have values in the range [0, 1], then the first d + 1 values are usually
clamped to 0, and the last d + 1 values to 1. If the knot vector is clamped, it can be
easily verified that


$$B_{1,d}(t_{d+1}) = B_{2,d-1}(t_{d+1}) = \ldots = B_{d+1,0}(t_{d+1}) = 1 \qquad (7.65)$$


and hence P(t_{d+1}) = P_1. Similarly, making use of the special condition in Eq. 7.62
we can show that P(t_{n+1}) = P_n. The curve therefore passes through the first and the
last control points. Figure 7.23 shows the modified version of the curve in Fig. 7.21,
generated using the clamped knot vector {30, 30, 30, 40, 50, 60, 60, 60}.
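
The end-point behaviour of a clamped curve can be checked numerically with a small sketch of Eq. 7.61. The function names basis and bspline_point are assumptions, as are the five control points in the usage note (the coordinates used for Fig. 7.23 are not given in the text); only the knot vector is taken from the example above.

```python
def basis(i, d, t, knots, n):
    """B_{i,d}(t) with 1-based i; knots[0] holds t_1, n is the number of control points."""
    if d == 0:
        if knots[i - 1] <= t < knots[i]:
            return 1.0
        # Eq. 7.62: B_{n,0} is defined to be 1 at the end point t = t_{n+1}.
        return 1.0 if (i == n and t == knots[n]) else 0.0
    value = 0.0
    left = knots[i + d - 1] - knots[i - 1]        # t_{i+d} - t_i
    right = knots[i + d] - knots[i]               # t_{i+d+1} - t_{i+1}
    if left > 0:                                  # zero denominators are skipped (Eq. 7.57)
        value += (t - knots[i - 1]) / left * basis(i, d - 1, t, knots, n)
    if right > 0:
        value += (knots[i + d] - t) / right * basis(i + 1, d - 1, t, knots, n)
    return value


def bspline_point(ctrl, d, knots, t):
    """Eq. 7.61 for t in [t_{d+1}, t_{n+1}]; requires n + d + 1 knots."""
    n = len(ctrl)
    if len(knots) != n + d + 1:
        raise ValueError("expected n + d + 1 knots")
    weights = [basis(i, d, t, knots, n) for i in range(1, n + 1)]
    return tuple(
        sum(w * p[c] for w, p in zip(weights, ctrl)) for c in range(len(ctrl[0]))
    )
```

With hypothetical control points (0, 0), (1, 3), (3, 4), (5, 3), (6, 0), degree d = 2 and the clamped knot vector {30, 30, 30, 40, 50, 60, 60, 60}, evaluating at t = 30 and t = 60 should return the first and the last control point respectively.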

As the degree d of the curve is increased, it tends to move further away from
the control points. However, the curve always remains within the convex hull of the
control points. Figure 7.24 gives an example with eight control points and clamped
knot vectors for three different values of d. As d is increased, the number of internal
knots (n - d - 1) in the clamped knot vector reduces, and eventually becomes zero
when d = n - 1. At this point, the B-spline curve degenerates into a Bezier curve.
Specifically, a B-spline curve of degree d with d + 1 control points and 2(d + 1)
knots is actually a Bezier curve of degree d if the knot vector is clamped such that


$$t_i = \begin{cases} 0, & \text{if } 1 \le i \le d + 1 \\ 1, & \text{if } d + 2 \le i \le 2(d + 1) \end{cases} \qquad (7.66)$$


Fig. 7.24 Approximating curves of different degrees for the same set of control points


If a knot t_i is repeated m times in the knot vector, then a spline function P(t) of
degree d has continuous derivatives up to order d - m at t_i. Thus if an internal knot
has multiplicity d, then the curve will only have C^0 continuity at that knot value. If
none of the knots has multiplicity greater than one, the curve has C^{d-1} continuity at
all points. This property and other features of B-spline curves such as local control,
convex hull property and affine invariance make them suitable for a wide range of
applications in computer graphics.


7.8.3 NURBS 


Similar to rational Bezier curves (Sect. 7.6.3), the control points for a B-spline
curve can be expressed in the homogeneous coordinate system, each containing
an additional scale factor h. This modification causes the approximating curve's
equation to have a rational form. Further, if the knot vector does not contain
uniformly spaced values, then we have a Non-Uniform Rational Basis Spline,
or NURBS, as it is commonly known in computer graphics literature. In the
homogeneous coordinate space, control points P_i = (x_i, y_i, z_i) can be expressed
as (x_i h_i, y_i h_i, z_i h_i, h_i), h_i ≠ 0. The term h_i acts as a scalar weight for each point,
providing an extra level of control over the shape of the spline curve. The parametric
equation of the spline curve in Eq. 7.61 now becomes


$$P(t) = \frac{\sum_{i=1}^{n} h_i P_i B_{i,d}(t)}{\sum_{i=1}^{n} h_i B_{i,d}(t)}, \quad t_{d+1} \le t \le t_{n+1} \qquad (7.67)$$

If the knot vector is clamped as given in Eq. 7.66, then the above equation
yields a rational Bezier curve. As an example, a circular arc that subtends an angle
2θ at the centre can be generated by representing the middle control point P_2 in
homogeneous coordinates with h = cos θ. Suppose we require an arc between two
points P_1 = (1, 0) and P_3 = (3, 0), so that the subtended angle is 60° (θ = 30°). The
second control point P_2 must be at the position where the two tangents to the curve
meet. Therefore, in this case, P_2 = (2, tan θ). The NURBS curve in Fig. 7.25a is
generated by specifying the control points in homogeneous coordinates as (1, 0, 1),
(2 cos θ, sin θ, cos θ), and (3, 0, 1). Three circular arcs, each subtending an angle of
120° at the centre, can be combined as shown in Fig. 7.25b to form a complete
circle.


Fig. 7.25 Generation of circular arcs using NURBS


Fig. 7.26 The surface of a light bulb modelled by revolving a B-spline curve about the y-axis
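
The 60° arc of this example can be checked with a short sketch of Eq. 7.67. It assumes the clamped knot vector {0, 0, 0, 1, 1, 1}, for which the second degree B-splines reduce to the quadratic Bernstein polynomials (Eq. 7.66). The function name rational_quadratic is an assumption, and the centre (2, -√3) and radius 2 quoted in the usage note are derived here from the chord length and θ; they are not stated in the text.

```python
from math import cos, radians, tan


def rational_quadratic(ctrl, weights, t):
    """Eq. 7.67 for three control points with clamped knots {0, 0, 0, 1, 1, 1}."""
    basis = ((1.0 - t) ** 2, 2.0 * t * (1.0 - t), t ** 2)    # B_{i,2}(t), i = 1, 2, 3
    den = sum(h * b for h, b in zip(weights, basis))         # sum of h_i B_{i,2}(t)
    return tuple(
        sum(h * b * p[c] for h, b, p in zip(weights, basis, ctrl)) / den
        for c in range(len(ctrl[0]))
    )


theta = radians(30.0)
arc_ctrl = [(1.0, 0.0), (2.0, tan(theta)), (3.0, 0.0)]       # P_1, P_2, P_3
arc_weights = [1.0, cos(theta), 1.0]                          # h = cos(theta) for P_2
```

Since half the chord length is 1 = r sin θ, the arc has radius r = 2 and centre (2, -r cos θ) = (2, -√3). Every point returned by rational_quadratic(arc_ctrl, arc_weights, t) for 0 ≤ t ≤ 1 should lie at distance 2 from that centre.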

B-splines and NURBS are widely used in the design of surfaces. A simple surface 
design method is to first model a spline curve on the xy-plane, and then revolve the 
curve about the y-axis to generate a surface of revolution (Fig. 7.26). The following 
sections discuss some of the important spline based surface generation techniques 
used in computer graphics. 


7.9 Surface Patches 


Surface patches are two-parameter analogues of curve segments that are defined
using blending functions in two independent parameters and a set of control
points. Linear, quadratic and cubic interpolation methods used for generating
curve segments can be extended to bilinear, biquadratic and bicubic polynomial
interpolation methods for constructing surface patches. Given four control points
P_{00}, P_{01}, P_{10}, P_{11} as in Fig. 7.27, a polygonal surface passing through them can be
obtained by a bilinear interpolation between the points using two parameters u and
v as follows:


$$\begin{aligned}
L(u, v) &= (1 - v)\big((1 - u) P_{00} + u P_{10}\big) + v\big((1 - u) P_{01} + u P_{11}\big) \\
&= (1 - u)(1 - v) P_{00} + u(1 - v) P_{10} + (1 - u) v P_{01} + u v P_{11}, \\
&\qquad 0 \le u \le 1, \quad 0 \le v \le 1.
\end{aligned} \qquad (7.68)$$


Fig. 7.27 (a) A bilinear surface patch. (b) A bi-cubic patch formed using four control points


The interpolating patch in this case is simply a quadrilateral surface element with 
straight edges connecting the control points. Hence the above equation is not very 
useful in surface design applications. 
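
Eq. 7.68 can be written as a one-line function. The name bilinear_patch and the argument order are assumptions of this sketch.

```python
def bilinear_patch(p00, p01, p10, p11, u, v):
    """Eq. 7.68: bilinear interpolation of four corner points given as tuples."""
    return tuple(
        (1.0 - u) * (1.0 - v) * a + u * (1.0 - v) * b + (1.0 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)   # weights pair with P00, P10, P01, P11
    )
```

Evaluating at (u, v) = (0, 0), (1, 0), (0, 1) and (1, 1) returns P_{00}, P_{10}, P_{01} and P_{11} respectively, and (0.5, 0.5) returns the average of the four corners.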

We can use a general bi-cubic polynomial equation as given below for constructing
an interpolating surface that passes through four control points:


$$P(u, v) = \sum_{i=0}^{3} \sum_{j=0}^{3} c_{ij}\, u^i v^j, \quad 0 \le u \le 1, \quad 0 \le v \le 1. \qquad (7.69)$$


The above equation has 16 unknowns c_{ij}, and requires 16 boundary conditions to
provide a unique solution for the coefficients. These boundary conditions are formed
using the four control points, the four tangent vectors along the u-direction at the
points, the four tangent vectors along the v-direction, and the four twist vectors.
We use the following notations for the partial derivatives of P(u, v). The first two
derivatives give the tangent vectors along parametric directions, and the third gives
the twist vector at any point (a, b):


$$P_u(a, b) = \left( \frac{\partial P}{\partial u} \right)_{\substack{u=a \\ v=b}}, \quad
P_v(a, b) = \left( \frac{\partial P}{\partial v} \right)_{\substack{u=a \\ v=b}}, \quad
P_{uv}(a, b) = \left( \frac{\partial^2 P}{\partial u\, \partial v} \right)_{\substack{u=a \\ v=b}} \qquad (7.70)$$
With the above notation, the boundary conditions can be written as follows:


$$\begin{aligned}
P(0,0) &= c_{00} = P_{00} \\
P(0,1) &= c_{00} + c_{01} + c_{02} + c_{03} = P_{01} \\
P(1,0) &= c_{00} + c_{10} + c_{20} + c_{30} = P_{10} \\
P(1,1) &= \sum_{i=0}^{3} \sum_{j=0}^{3} c_{ij} = P_{11} \\
P_u(0,0) &= c_{10} \\
P_u(0,1) &= c_{10} + c_{11} + c_{12} + c_{13} \\
P_u(1,0) &= c_{10} + 2c_{20} + 3c_{30} \\
P_u(1,1) &= \sum_{i=0}^{3} c_{1i} + 2\sum_{i=0}^{3} c_{2i} + 3\sum_{i=0}^{3} c_{3i} \\
P_v(0,0) &= c_{01} \\
P_v(0,1) &= c_{01} + 2c_{02} + 3c_{03} \\
P_v(1,0) &= c_{01} + c_{11} + c_{21} + c_{31} \\
P_v(1,1) &= \sum_{i=0}^{3} c_{i1} + 2\sum_{i=0}^{3} c_{i2} + 3\sum_{i=0}^{3} c_{i3} \\
P_{uv}(0,0) &= c_{11} \\
P_{uv}(0,1) &= c_{11} + 2c_{12} + 3c_{13} \\
P_{uv}(1,0) &= c_{11} + 2c_{21} + 3c_{31} \\
P_{uv}(1,1) &= c_{11} + 4c_{22} + 9c_{33} + 2c_{21} + 2c_{12} + 3c_{31} + 3c_{13} + 6c_{23} + 6c_{32}
\end{aligned} \qquad (7.71)$$


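The bi-cubic surface patch obtained by solving the above linear system of
equations is given by


$$P(u, v) = \begin{bmatrix} f_1(u) & f_2(u) & f_3(u) & f_4(u) \end{bmatrix}
\begin{bmatrix}
P(0,0) & P(0,1) & P_v(0,0) & P_v(0,1) \\
P(1,0) & P(1,1) & P_v(1,0) & P_v(1,1) \\
P_u(0,0) & P_u(0,1) & P_{uv}(0,0) & P_{uv}(0,1) \\
P_u(1,0) & P_u(1,1) & P_{uv}(1,0) & P_{uv}(1,1)
\end{bmatrix}
\begin{bmatrix} f_1(v) \\ f_2(v) \\ f_3(v) \\ f_4(v) \end{bmatrix} \qquad (7.72)$$


where the blending functions f_i(u) are the Hermite polynomials given in Eq. 7.29.
Figure 7.27b shows an example of a bi-cubic surface patch.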


7.10 Coons Patches


The interpolation methods discussed in the previous section use positions and
derivatives defined at the control points as boundary conditions. A surface patch
may be required to have curves with known equations along its four edges. Suppose
four edge curves forming the boundary of a region are given by parametric functions
C_1(u), C_2(u), D_1(v), D_2(v) as shown in Fig. 7.28. All four curves are defined over
the same interval [0, 1].

At the corner points, the curves satisfy the conditions P_{00} = C_1(0) = D_1(0),
P_{10} = C_1(1) = D_2(0), P_{01} = C_2(0) = D_1(1), and P_{11} = C_2(1) = D_2(1). By linearly
interpolating between corresponding points of C_1(u) and C_2(u) using the second
parameter v, we get the following ruled surface:


$$R_C(u, v) = (1 - v) C_1(u) + v C_2(u), \quad 0 \le v \le 1 \qquad (7.73)$$


Similarly, interpolating between D_1(v) and D_2(v) using the parameter u, we get
another ruled surface:


$$R_D(u, v) = (1 - u) D_1(v) + u D_2(v), \quad 0 \le u \le 1 \qquad (7.74)$$


Figure 7.29a shows four Bezier curves surrounding a region in three-dimensional 
space. The corresponding ruled surfaces generated by the two equations given above 
are shown in Fig. 7.29b, c. Each ruled surface follows the shape of the bounding 
curves along one parametric direction. 

The bilinear Coons patch bounded by the four parametric curves is obtained by 
adding together the above two ruled surfaces and subtracting the surface obtained 
from Eq. 7.68: 


$$P(u, v) = R_C(u, v) + R_D(u, v) - L(u, v), \quad 0 \le u \le 1, \quad 0 \le v \le 1 \qquad (7.75)$$


Figure 7.30 shows the surface patch produced by applying the above equation in
the example given in Fig. 7.29.


Fig. 7.28 A region for a surface patch specified using four bounding curves


Fig. 7.29 (a) A region specified by four bounding curves. (b) Ruled surface R_C(u, v). (c) Ruled
surface R_D(u, v)


Fig. 7.30 Bilinear Coons patch corresponding to the set of curves in Fig. 7.29a

It can be easily verified that the surface patch P(u, v) satisfies the desired
boundary conditions:


$$P(u, 0) = C_1(u), \quad P(u, 1) = C_2(u), \quad P(0, v) = D_1(v), \quad P(1, v) = D_2(v) \qquad (7.76)$$
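
The boundary conditions in Eq. 7.76 can be confirmed numerically with a sketch of Eq. 7.75. The function name coons_patch is an assumption, and the boundary curves are passed as Python callables that return tuples; the corner points are read off the curves C_1 and C_2.

```python
def coons_patch(c1, c2, d1, d2, u, v):
    """Bilinear Coons patch of Eq. 7.75 for boundary curves given as callables."""
    p00, p10 = c1(0.0), c1(1.0)                 # corners on C_1
    p01, p11 = c2(0.0), c2(1.0)                 # corners on C_2
    cu1, cu2, dv1, dv2 = c1(u), c2(u), d1(v), d2(v)
    point = []
    for k in range(len(p00)):
        ruled_c = (1.0 - v) * cu1[k] + v * cu2[k]                  # Eq. 7.73
        ruled_d = (1.0 - u) * dv1[k] + u * dv2[k]                  # Eq. 7.74
        bilinear = ((1.0 - u) * (1.0 - v) * p00[k] + u * (1.0 - v) * p10[k]
                    + (1.0 - u) * v * p01[k] + u * v * p11[k])     # Eq. 7.68
        point.append(ruled_c + ruled_d - bilinear)
    return tuple(point)
```

As a usage example with hypothetical curves, take C_1(u) = (u, 0, u(1 - u)), C_2(u) = (u, 1, 0), D_1(v) = (0, v, v(1 - v)) and D_2(v) = (1, v, 0), which meet at the four corners as required. Evaluating the patch along v = 0 then reproduces C_1(u), and along u = 0 it reproduces D_1(v).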


In general, the derivatives of bilinear Coons patches along the parametric directions
are not continuous across a shared edge, and hence the surface patches do not join
smoothly along a common edge curve. Bi-cubic interpolants are used to obtain first
order geometric continuity along joining curves. A bi-cubic Coons patch is a smooth
blending surface created by using Hermite polynomials (see Eq. 7.29) instead of
linear interpolants:


$$\begin{aligned}
P(u, v) &= f_1(v) C_1(u) + f_2(v) C_2(u) + f_1(u) D_1(v) + f_2(u) D_2(v) \\
&\quad - f_1(v)\big(f_1(u) P_{00} + f_2(u) P_{10}\big) - f_2(v)\big(f_1(u) P_{01} + f_2(u) P_{11}\big)
\end{aligned} \qquad (7.77)$$


where f_1(u) = 1 - 3u^2 + 2u^3, and f_2(u) = 3u^2 - 2u^3.


7.11 Bi-Cubic Bezier Patches


In this section, we consider the extension of cubic Bezier curve segments to Bezier
surface patches. The general Bezier equation in Eq. 7.35 can be extended to a
two-parameter surface equation as


$$P(u, v) = \sum_{i=0}^{n} \sum_{j=0}^{m} B_{i,n}(u) B_{j,m}(v) P_{i+1,j+1}, \quad 0 \le u \le 1, \quad 0 \le v \le 1. \qquad (7.78)$$


The two-dimensional array of points P_{ij}, i = 1 ... n + 1, j = 1 ... m + 1 forms a
control polygonal surface. As a special case of the above, the bi-cubic Bezier patch
is defined using a topologically quadrilateral arrangement of 16 control points P_{ij},
i = 1 ... 4, j = 1 ... 4 (Fig. 7.31).

Setting m = n = 3 in Eq. 7.78, we get


$$P(u, v) = \sum_{i=0}^{3} \sum_{j=0}^{3} B_{i,3}(u) B_{j,3}(v) P_{i+1,j+1} \qquad (7.79)$$


where B_{0,3}(u) = (1 - u)^3, B_{1,3}(u) = 3u(1 - u)^2, B_{2,3}(u) = 3u^2(1 - u) and
B_{3,3}(u) = u^3.

A bi-cubic Bezier patch has several desirable properties that make it suitable
for surface design applications (Fig. 7.32). From Eq. 7.78, it can be seen that
P(0, 0) = P_{11}, P(1, 0) = P_{41}, P(0, 1) = P_{14}, and P(1, 1) = P_{44}. Thus, the four corner
points of the control polygonal surface lie on the Bezier patch. It can also be
observed that


$$P(u, 0) = \sum_{i=0}^{3} B_{i,3}(u) P_{i+1,1}, \quad 0 \le u \le 1. \qquad (7.80)$$


The above equation shows that P(u, 0) is a cubic Bezier curve formed using the
control points P_{11}, P_{21}, P_{31} and P_{41}. Similarly, we can prove that the remaining edge
curves of the surface patch are also Bezier curves. In fact, for any constant c, both
P(u, c) and P(c, v) are cubic Bezier curves.
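
The corner and edge properties above can be checked with a direct evaluation of Eq. 7.79. In this sketch the function names are assumptions, points are three-dimensional tuples, and ctrl[i][j] stores the control point P_{i+1,j+1}.

```python
def bernstein3(i, u):
    """Cubic Bernstein polynomial B_{i,3}(u), i = 0 ... 3."""
    return (1.0, 3.0, 3.0, 1.0)[i] * u**i * (1.0 - u) ** (3 - i)


def bezier_patch(ctrl, u, v):
    """Eq. 7.79 for a 4 x 4 grid of 3-D control points; ctrl[i][j] is P_{i+1,j+1}."""
    point = [0.0, 0.0, 0.0]
    for i in range(4):
        for j in range(4):
            w = bernstein3(i, u) * bernstein3(j, v)    # tensor-product weight
            for c in range(3):
                point[c] += w * ctrl[i][j][c]
    return tuple(point)
```

For any 4 × 4 grid, bezier_patch(ctrl, 0, 0) returns ctrl[0][0] (that is, P_{11}) and bezier_patch(ctrl, 1, 0) returns ctrl[3][0] (that is, P_{41}), in agreement with the corner conditions stated above.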


Fig. 7.31 A control polygonal surface for a bi-cubic Bezier patch


Fig. 7.32 A Bezier control polygonal surface and wireframe model of its cubic Bezier patch


Fig. 7.33 (a) Polygonal elements along a common edge must be coplanar to ensure first-order
continuity. (b) Control polygonal surfaces joined together to form a closed surface. (c) The
resulting Bezier patches have first order continuity


Since Eq. 7.78 defines a convex combination of the control points, the Bezier 
surface patch lies within the convex hull of the control points. Another important 
property useful in computer graphics is the affine invariance of Bezier patches. For 
any affine transformation given by a matrix T, the transformed Bezier surface can 
be obtained as 


\[
P'(u, v) = T\,P(u, v) = \sum_{i=0}^{n} \sum_{j=0}^{m} B_{i,n}(u)\, B_{j,m}(v)\, \bigl(T\,P_{i+1,j+1}\bigr) \tag{7.81}
\]


which shows that the transformed patch can also be obtained by computing the 
Bezier surface of the transformed control points. 
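A short justification of this property is sketched below. It assumes the affine transformation is written as a linear part A plus a translation t (notation introduced here only for the sketch), and uses the fact that the Bernstein polynomials sum to one.

```latex
T(\mathbf{p}) = A\,\mathbf{p} + \mathbf{t},
\qquad
\sum_{i=0}^{n} \sum_{j=0}^{m} B_{i,n}(u)\, B_{j,m}(v) = 1,

\sum_{i=0}^{n} \sum_{j=0}^{m} B_{i,n}(u)\, B_{j,m}(v)\, \bigl(A\,P_{i+1,j+1} + \mathbf{t}\bigr)
  = A \sum_{i=0}^{n} \sum_{j=0}^{m} B_{i,n}(u)\, B_{j,m}(v)\, P_{i+1,j+1} + \mathbf{t}
  = T\bigl(P(u, v)\bigr).
```

In practice this means that only the control points need to be transformed, rather than every point sampled on the surface.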

When several Bezier patches are joined together to form complex shapes, it 
becomes necessary to have at least first-order geometric continuity along the edges 
where two patches join. A sufficient condition for meeting this requirement is the 
co-planarity of polygonal elements of the corresponding control surfaces that share 
a common edge (Fig. 7.33a). 

The surface of the Utah Teapot is specified using 32 control polygonal surfaces. 
A section of the main body consisting of four control surfaces is shown in Fig. 7.33b. 
The bold lines show the edges where the surfaces meet. The Bezier patches form a 
continuous surface as shown in Fig. 7.33c. 
7.12 Summary


This chapter has outlined some of the fundamental curve and surface generation 
techniques used in computer graphics. Polynomial interpolation curves of high 
orders do not provide the flexibility and shape control needed in many applications. 
Piecewise cubic curves provide a computationally simple solution where convex 
combinations of four control points are generated using a set of blending functions. 
When a set of piecewise curves are joined together, parametric continuity at the 
points where the curve segments meet becomes important. Tangential continuity is 
generally achieved by adding end-point constraints for the first order derivatives. 
Hermite curves, cardinal splines and cubic Bezier curves are all generated in this 
fashion. Piecewise curves with higher order continuity can be generated using B- 
splines. Rational Bezier curves and rational B-spline curves are constructed using 
the homogeneous coordinate representation of the control points. 

This chapter has also introduced important spline based surface design tech- 
niques using blending polynomials in two independent parameters. Bi-cubic surface 
patches can be seamlessly joined together to form complex three-dimensional 
shapes. 


7.13 Supplementary Material for Chap. 7 


The section Chapter7/Code on the companion website contains the following
programs demonstrating the curve and surface generation techniques discussed in 
this chapter. 


1. PolyInterp.cpp


Additional files: 
none 


The program generates an interpolating polynomial curve (Sect. 7.1) through 
a set of points. The points are specified using mouse input (left button). The 
maximum number of points (and hence the maximum order of the polynomial) 
is set to 7. After defining the points, press ‘p’ to draw the polynomial curve, or 
‘c’ to clear the screen and start over again.
2. CatmullRom.cpp


Additional files: 


none 


The program generates a Catmull-Rom spline (Sect. 7.5) through a set of 
points interactively specified using mouse input (left button). The curve is 
updated as and when a new point is input. The tangent directions at each input 
point are also shown. A point's position can be changed by clicking on it and 
dragging it with the right mouse button pressed. Press ‘c’ to clear the screen and 
start over. 


3. Bezier2D.cpp 


Additional files: 
none 


The program uses the OpenGL evaluator functions for drawing a two- 
dimensional Bezier curve (Sect. 7.6) for a given set of points specified inter- 
actively using mouse input. The control polygonal line and the corresponding 
Bezier curve are updated as and when a new point is input. Press ‘c’ to clear the 
screen and start over. 


4. Bezier3D.cpp 


Additional files: 
teapot.dat 
teacup.dat 
teaspoon.dat 


The program uses OpenGL evaluator functions to generate three-dimensional 
Bezier patches for the control polyhedra stored in an input file. The program 
reads three input files that contain polyhedral data for the Utah teapot, teacup and 
teaspoon. Select ‘1’ for the teapot, ‘2’ for the teacup and ‘3’ for the teaspoon.
Pressing the space bar toggles between the displays of the control polyhedral
surface and the Bezier surface. Pressing ‘n’ increases the number of subdivisions.
The arrow keys are used to change the view direction. 
5. Bicubic.cpp


Additional files: 
boundary.dat 


The program generates a bi-cubic patch (Sect. 7.9) on a set of 4 control points. 
The control points and the boundary conditions for the patch are read in from the 
file "boundary.dat". Use left or right arrow keys to change the view direction.


6. Coons.cpp 
Additional files: 


CurveCoeffs.dat 


The program generates a Coons patch (Sect. 7.10) using four parametric 
curves. The curves C1(u), C2(u), D1(v), D2(v) are specified using the coefficients
stored in the file "CurveCoeffs.dat". Use left or right arrow keys to change the
view direction. 


7. SurfRevln.cpp 


Additional files: 
none 


The program uses OpenGL evaluator functions to generate a two-dimensional 
NURBS curve through a set of user defined points. The points are interactively 
specified using mouse clicks. Pressing ‘s’ key generates a surface by revolving 
the curve about the y-axis. Press ‘c’ to clear the screen and start over again. 
7.14 Bibliographical Notes


Curve and surface design techniques are generally discussed in detail in text books 
on computer-aided design and geometric modelling (e.g., Farin (2001), Goldman 
(2009), Olfe (1995)). Some computer graphics texts also give an excellent coverage 
of the mathematical and implementation aspects of spline curves and parametric 
surfaces (e.g., Buss (2003), McConnell (2006), Watt and Watt (1992), Salomon 
(2006)). 

The notions of parametric and geometric continuity are clearly explained in the 
fundamental paper by Barsky and Tony (1989). Surface construction techniques 
using NURBS, Coons patches and ruled surfaces are covered in Piegl and Tiller 
(1997). A comprehensive analysis of rational Bezier curves and surfaces, and 
quadric surfaces can be found in Farin (1999). Bezier and B-spline curves and 
surfaces are also discussed in detail in Prautzsch et al. (2002). 
Chapter 8
Mesh Processing 


Overview 


In computer graphics applications, three-dimensional models are almost always 
represented using polygonal meshes. A mesh in its simplest form consists of a set 
of vertices, polygons, and optionally a number of additional vertex and polygonal 
attributes. The complexity of a mesh can vary from low to very high depending 
on requirements such as rendering quality, speed and resolution. A wide spectrum 
of mesh processing algorithms is used by graphics and game developers for a 
variety of applications such as generating, simplifying, smoothing, remapping and 
transforming meshes. Several types of data structures and file formats are also used 
to store mesh data. 

This chapter discusses the geometrical and topological aspects related to three- 
dimensional meshes and their processing. It also presents important data structures 
and algorithms used for operations such as mesh simplification, mesh subdivision, 
planar embedding, and polygon triangulation. 


8.1 Mesh Representation 


A polygonal mesh is a set of vertices and polygonal elements that collectively 
define a three-dimensional geometrical shape. The simplest mesh representation 
thus consists of a vertex list and a polygon list as shown in Fig. 8.1. Polygons are 
often defined in terms of triangular elements. Since triangles are always both planar 
and convex, they can be conveniently used in several geometrical computations such 
as point inclusion tests, area and normal calculations and interpolation of vertex 
attributes. 

The vertex list contains the three-dimensional coordinates of the mesh vertices 
defined in a suitable coordinate frame, and the polygon list contains integer values 
that index into the vertex list. An anticlockwise ordering of vertices with respect 
Fig. 8.1 A cube and its mesh definition using vertex and polygon lists
Fig. 8.2 The cut-open view of the cube in Fig. 8.1 showing its representation as a triangle strip


to the outward face normal direction is commonly used to indicate the front facing 
side of each polygon. The distinction between the front and the back faces of a 
polygon becomes important in lighting computations and culling operations. If the 
polygon list represents a set of connected triangles as in Fig. 8.1, a more efficient 
and compact data structure called a triangle strip may be used. The first three indices 
in a triangle strip specify the first triangle. The fourth index along with the previous 
two indices represents the second triangle. In this fashion, each remaining index 
represents a triangle that is defined by that index and the previous two indices. 

The representation of a cube as a triangle strip is given in Fig. 8.2. The triangle 
strip is decoded as the set of 12 triangles {012, 123, 237, 371, 715, 154, 547, 
476, 762, 624, 240, 401}. Note that the orientation of triangles alternates between 
clockwise and anticlockwise in this representation. The change of orientation is 
corrected by reversing the direction of every alternate triangle in the list, starting 
from the second triangle. Thus the above list would be correctly interpreted as {012, 
213, 237, 731, 715, 514, 547, 746, 762, 264, 240, 041}. If the first triangle is defined 
in the anticlockwise sense, then all triangles in the corrected list will have the same 
orientation. 
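The decoding rule described above can be sketched in a few lines. This is a minimal illustration with hypothetical names (Tri, decodeStrip); it assumes the strip is given as a plain list of vertex indices and reverses every alternate triangle by swapping its first two indices, as in the corrected list above.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

using Tri = std::array<int, 3>;

// Decodes a triangle strip into a list of triangles with a consistent orientation.
// Triangle k-2 is formed by strip[k-2], strip[k-1], strip[k].
std::vector<Tri> decodeStrip(const std::vector<int>& strip)
{
    std::vector<Tri> tris;
    for (std::size_t k = 2; k < strip.size(); ++k) {
        Tri t{strip[k - 2], strip[k - 1], strip[k]};
        if (k % 2 == 1)                 // second, fourth, ... triangle of the strip
            std::swap(t[0], t[1]);      // reverse its direction
        tris.push_back(t);
    }
    return tris;
}
```

For the cube of Fig. 8.2, the strip 0 1 2 3 7 1 5 4 7 6 2 4 0 1 produces the twelve triangles listed above, starting with 012, 213, 237, 731.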

Several file formats are used in graphics applications for storing and sharing mesh 
data. A number of such file formats represent values in binary and compressed forms 
for minimizing storage space. In this section, we review some of the popular ASCII 
file formats that allow easy viewing and editing of mesh data. The Object (.OBJ)
format was developed by Wavefront Technologies. This format allows the definition
Box 8.1 OBJ File Format


Comments start with the symbol #
e.g., # 3D Model definition


A vertex definition starts with the symbol v and is followed by 3 or 4 floating
point values. Each vertex is implicitly assigned an index. The first vertex has
an index 1.
e.g., v -1.53 2.06 3.02
v 15.96 -2.9 95.0069 3.50


Texture coordinates are specified by the symbol vt followed by two or three
floating point values in the range [0, 1]. Texture coordinates are mapped to
vertex coordinates through the face (‘f’) command. The first set of texture
coordinates has an index 1.
e.g., vt 0.25 0.90
vt 0.0 0.55 0.85


Vertex normals are specified using the vn command. The normal components
are assigned to a vertex through the face (‘f’) command. The first set of normal
components is assigned an index 1.
e.g., vn -0.2566 0.129688 -0.756


A polygon definition uses a face command that starts with the symbol f
and is followed by a list of positive integers that are valid vertex indices.
e.g., f 8 3 6
f 15 8 1 22

The above face command has a more general form f v/vt/vn v/vt/vn v/vt/vn ...
that can be used to combine texture and normal attributes with vertices.
Both /vt and /vn fields are optional.
e.g., f 2/3/11 3/5/2 6/1/7
f 15/2 8/3 1/5 22/9
f 6//1 7//2 8//3


The first example above defines a triangle including references to the texture 
and normal coordinates at the vertices. The second example attaches only 
texture coordinate references to each vertex, while the third example uses 
only the normal vectors. 


of vertices in terms of either three-dimensional Cartesian coordinates or four-
dimensional homogeneous coordinates. Polygons can have more than three
vertices. In addition to the basic set of commands supporting simple polygonal mesh
data (Box 8.1), the .OBJ format also supports a number of advanced features such as 
grouping of polygons, material definitions and the specification of free-form surface 
geometries including curves and surfaces. 
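The basic commands of Box 8.1 can be read with a few lines of code. The following is a minimal sketch under stated assumptions, not a complete OBJ reader: the names ObjMesh and readObj are hypothetical, only the v and f commands are interpreted, the optional fourth vertex coordinate and the /vt and /vn fields are ignored, and negative (relative) indices are not handled.

```cpp
#include <array>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct ObjMesh {
    std::vector<std::array<double, 3>> vertices;   // x, y, z
    std::vector<std::vector<int>> faces;           // 0-based vertex indices
};

// Reads the v and f commands of an OBJ stream; all other commands are skipped.
ObjMesh readObj(std::istream& in)
{
    ObjMesh mesh;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        std::string cmd;
        if (!(ls >> cmd) || cmd[0] == '#') continue;   // blank line or comment
        if (cmd == "v") {
            std::array<double, 3> p{};
            ls >> p[0] >> p[1] >> p[2];
            mesh.vertices.push_back(p);
        } else if (cmd == "f") {
            std::vector<int> face;
            std::string token;          // "v", "v/vt", "v//vn" or "v/vt/vn"
            while (ls >> token) {
                const int idx = std::stoi(token.substr(0, token.find('/')));
                assert(idx > 0);        // relative indices are not supported here
                face.push_back(idx - 1);   // OBJ indices start at 1
            }
            mesh.faces.push_back(face);
        }
    }
    return mesh;
}
```

Converting the indices to a 0-based form at load time keeps the rest of a program independent of the file format, since the OFF and PLY formats described below number their vertices from 0.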
Box 8.2 OFF File Format

The first line should contain the header keyword OFF 

This line can be followed by optional comment lines starting with the 
character # 

e.g., # Model file for a cube


The first non-comment line should have three integer values nv, nf, ne denoting
the total number of vertices, faces and edges. The number of edges (ne) is
always set to 0.
e.g., 8 6 0
The above line is followed by the vertex list. The number of vertices in the list
must match the number nv. The first vertex is assigned the index 0, and the
last vertex the index nv-1.
e.g., 1.0 1.5 -0.5
1.4 -1.9 1.7


Vertices can also be specified using four coordinates in homogeneous form. 
In this case, the header keyword should be changed to 4OFF. 
The vertex list is followed by the face list. Each line contains a set of 


integers n, i1, i2, ..., in, where the first integer n gives the number of
vertices of that face and the remaining integers give the vertex indices.
e.g., 3 2 0 1
4 0 4 5 1


Color values in either RGB or RGBA representation can be optionally added 
to each face as 3 or 4 integer values in the range [0, 255] or floating point 
values in the range [0, 1]. 
e.g., 3 1 0 5 255 255 0 0
4 3 5 2 6 1.0 0.5 0.0 1.0


The Object File Format (.OFF) is another convenient ASCII format for storing 
3D model definitions. It uses simple vertex-list and face-list structures for specifying 
a polygonal model. Unlike the .OBJ format, this format does not intersperse 
commands with values on every line, and therefore can be easily parsed to extract 
vertex coordinates and face indices. This format also allows users to specify vertices 
in homogeneous coordinates, faces with more than three vertex indices, and optional 
colour values for every vertex or face (Box 8.2). 
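Because an OFF file carries no per-line commands, a reader only needs to follow the order header, counts, vertex list, face list. The sketch below illustrates this under stated assumptions: the names OffMesh and readOff are hypothetical, only the three-coordinate OFF variant is handled, and any colour values that follow the indices of a face are ignored.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct OffMesh {
    std::vector<std::array<double, 3>> vertices;
    std::vector<std::vector<int>> faces;   // 0-based vertex indices, as in the file
};

OffMesh readOff(std::istream& in)
{
    // Returns the next line that is neither blank nor a '#' comment.
    auto nextLine = [&in]() {
        std::string line;
        while (std::getline(in, line))
            if (line.find_first_not_of(" \t\r") != std::string::npos && line[0] != '#')
                return line;
        return std::string();
    };

    const std::string header = nextLine();
    assert(header.rfind("OFF", 0) == 0);   // header keyword must come first

    std::size_t nv = 0, nf = 0, ne = 0;
    std::istringstream counts(nextLine());
    counts >> nv >> nf >> ne;

    OffMesh mesh;
    for (std::size_t i = 0; i < nv; ++i) {
        std::istringstream ls(nextLine());
        std::array<double, 3> p{};
        ls >> p[0] >> p[1] >> p[2];
        mesh.vertices.push_back(p);
    }
    for (std::size_t i = 0; i < nf; ++i) {
        std::istringstream ls(nextLine());
        std::size_t n = 0;
        ls >> n;                            // number of vertices of this face
        std::vector<int> face(n);
        for (int& idx : face) ls >> idx;    // trailing colour values are not read
        mesh.faces.push_back(face);
    }
    return mesh;
}
```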

The Polygon File Format (.PLY) also organises mesh data as a vertex list and 
a face list with the addition of several optional elements. The format is also called 
the Stanford Triangle Format. Elements can be assigned a type (int, float, 
double, uint etc.), and a number of values that are stored against each element. 
Box 8.3 PLY File Format

The first line in the header should contain the keyword ply. The second line
specifies the file format using the format keyword.
e.g., format ascii 1.0


Comments begin with the keyword comment
e.g., comment Model definition for a cube


The total number of vertices, polygons etc. in the model definition is specified
using the element keyword.
e.g., element vertex 8
element face 6

The type of each element is specified using the property keyword. The following
commands specify the types of vertex coordinates.
e.g., property float x
property float y
property float z


The polygon data is usually defined using a set of vertex indices. The type 
specification is included in the header as 


property int vertex_index 


The keyword end_header is used to delimit the header information. The 
vertex and face lists follow this keyword. The first vertex has the index 0. 
e.g.,
end_header
0.5 0.5 0.5
1.5 0.5 -0.55


0 1 2 3
2 1 6 5


Such information is specified using a list of properties as part of the header 
(Box 8.3). This file format supports several types of elements and data, and the 
complete specification is included in the header. Parsing a PLY file is therefore 
considerably more complex than parsing an OBJ or OFF file.


8.2 Polygonal Manifolds 


The model definition files introduced in the previous section contain information 
about vertices, polygons, colour values, texture coordinates and possibly many other 
vertex and face related attributes that collectively specify the mesh geometry. As 
Fig. 8.3 Examples of manifold meshes


Fig. 8.4 Examples of non-manifold meshes


seen from the examples, list based mesh definitions often do not store any neigh- 
bourhood or connectivity information. The adjacency and incidence relationships 
between mesh elements define the topology of the mesh and are heavily used by 
several mesh processing algorithms. This section introduces some of the general 
and desirable topological characteristics of meshes. 

A common assumption in the construction of mesh data structures and related 
algorithms is that the given mesh is a polygonal manifold. A polygonal manifold is 
defined as a mesh that satisfies two conditions: (i) no edge is shared by more than
two faces, and (ii) the faces sharing a vertex can be ordered in such a way that their 
vertices excluding the shared vertex form a simple chain (Fig. 8.3). 

A non-manifold mesh may contain edges shared by more than two polygons, 
or vertices with more than one chain of neighbouring vertices (Fig. 8.4). In a non- 
manifold mesh, the neighbourhood of a point may not be topologically equivalent 
to a disc, which makes local adjustments surrounding that vertex difficult in many 
mesh processing algorithms. The methods discussed in this chapter assume that the 
given mesh satisfies the conditions of a polygonal manifold. 
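Condition (i) of the manifold definition is easy to test on a plain face list. The following sketch, with the hypothetical name edgesAreManifold, counts how many faces use each undirected edge; it does not test condition (ii) on the chain of vertices around each vertex.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Returns true if no edge is shared by more than two faces.
// Each face is a list of vertex indices given in order around the polygon.
bool edgesAreManifold(const std::vector<std::vector<int>>& faces)
{
    std::map<std::pair<int, int>, int> count;   // undirected edge -> number of faces
    for (const auto& f : faces)
        for (std::size_t k = 0; k < f.size(); ++k) {
            const int a = f[k];
            const int b = f[(k + 1) % f.size()];
            const auto key = std::make_pair(std::min(a, b), std::max(a, b));
            if (++count[key] > 2) return false;   // a third face uses this edge
        }
    return true;
}
```

A check of this kind is a reasonable precaution before building any of the data structures of Sect. 8.3, since they store at most two faces per edge.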

The chain of vertices surrounding a vertex in a polygonal manifold is closed if the 
vertex is an interior vertex, otherwise the vertex is a boundary vertex. In a triangular 
mesh, the triangles sharing a common vertex form a closed triangle fan for interior 
vertices, and an open triangle fan for boundary vertices (Fig. 8.5). An interior vertex 
Fig. 8.5 An interior and a boundary vertex on a
Fig. 8.6 One-ring and two-ring neighbours of a vertex on a manifold mesh


is also commonly called a simple vertex. A closed manifold that does not contain 
any boundary vertices is called a polyhedron. 

Two vertices are adjacent if they are connected by an edge of a polygon. As seen 
in Fig. 8.5, the set of vertices that are adjacent to a vertex v in a closed manifold 
forms a ring. This set is called the one-ring neighbourhood of the vertex v. The 
union of one-ring neighbourhoods of every vertex in this set is called the two-ring 
neighbourhood of v (Fig. 8.6). 

The orientation of the faces of a polygonal manifold is determined by the way 
in which its vertices are ordered. An anticlockwise ordering of vertices generally 
corresponds to the front face of a polygon. If two adjacent faces have the same 
orientation, they are said to be compatible. In this case, a common edge will have 
opposite directions in the two faces that share the edge (Fig. 8.7a). If every pair of 
adjacent faces is compatible, the mesh is said to be orientable. 
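The compatibility condition can be tested by looking at directed edges: if two faces sharing an edge are compatible, that edge appears once in each direction. The sketch below uses the hypothetical name facesAreCompatible and assumes the input already satisfies condition (i) of a polygonal manifold.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Returns true if no directed edge (from, to) is used by two different faces,
// i.e. every shared edge has opposite directions in the two faces that share it.
bool facesAreCompatible(const std::vector<std::vector<int>>& faces)
{
    std::set<std::pair<int, int>> seen;   // directed edges visited so far
    for (const auto& f : faces)
        for (std::size_t k = 0; k < f.size(); ++k) {
            const std::pair<int, int> e(f[k], f[(k + 1) % f.size()]);
            if (!seen.insert(e).second) return false;   // same direction used twice
        }
    return true;
}
```

For example, the triangles 0 1 2 and 0 2 3 are compatible, because their common edge is traversed as 2 to 0 in the first and as 0 to 2 in the second, whereas 0 1 2 and 2 0 3 are not.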

The number of edges incident on a vertex is called its valence. A mesh in which 
every face has the same number of edges, and every vertex has the same valence 
Fig. 8.7 (a) Compatible faces in an orientable mesh. (b) The Möbius strip is an example of a
non-orientable mesh 


is called a regular mesh. The number of vertices (V), edges (E), and faces (F) in a 
closed polygonal mesh are related by the Euler-Poincare formula 


V+F-E=2(1-g) (8.1) 


where g, the genus, denotes the number of holes/handles in the mesh. The right-
hand side of the above equation is called the Euler Characteristic. For the torus
in Fig. 8.3 and the Möbius strip in Fig. 8.7b, g = 1, and hence V + F = E. For
polyhedral objects without any holes, V + F = E + 2. This equation is generally
referred to as Euler's formula. In a triangular mesh without holes, the average
valence of a vertex is six, and we can get an estimate of the number of faces and 
edges in terms of the vertices as 


\[
F \approx 2V, \qquad E \approx 3V \tag{8.2}
\]


Also in a closed triangular mesh, every face has three edges, and every edge is
shared by two faces, so counting three edges per face counts each edge twice.
Therefore the number of faces and edges are connected by the equation E = 3F/2.
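The estimates in Eq. 8.2 follow from combining this relation with Euler's formula for a closed triangular mesh of genus zero. The short derivation is sketched below.

```latex
V + F - E = 2, \qquad E = \tfrac{3}{2} F
\;\Longrightarrow\;
V - \tfrac{1}{2} F = 2
\;\Longrightarrow\;
F = 2V - 4, \qquad E = 3V - 6,

\text{average valence} = \frac{2E}{V} = 6 - \frac{12}{V}.
```

For a large number of vertices the constant terms become negligible, which gives F ≈ 2V, E ≈ 3V and an average valence of approximately six.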


8.3 Mesh Data Structures 


Mesh data structures are designed to provide information about both mesh geometry 
and topology so that they could be used for fast traversal and processing of 
meshes. A large number of mesh operations extensively use information about mesh 
connectivity and local orientation around vertices. Mesh data structures also support 
efficient processing of incidence and adjacency queries. In this section, we consider 
one face-based and two edge-based data structures. 
struct Triangle
{
    Vertex *p1, *p2, *p3;
    Triangle *t1, *t2, *t3;
};


Fig. 8.8 A face based data structure for a triangle showing references to its neighbouring faces 


Fig. 8.9 Traversal of the one-ring neighbourhood of a vertex using a face-based data structure


8.3.1 Face-Based Data Structure 


Face-based data structures are primarily used for triangular meshes where both the 
number of edges and number of vertices per face have a constant value 3. In an 
ordinary mesh file, each triangle is defined using the indices of its three vertices. 
A face-based data structure additionally stores references to its three neighbouring 
triangles (Fig. 8.8). Because of its simple structure, a face data structure can be 
easily constructed from a vertex list and a face list. This data structure does not 
store any edge related information, and hence is not particularly suitable for edge 
operations such as edge collapse, edge flipping or edge traversal. 

Assuming that every polygonal face in a mesh is a triangle, the face-based data 
structure provides a convenient mechanism to obtain information about all triangles 
surrounding a vertex. Using this information, we can traverse the one-ring
neighbourhood of a vertex in time proportional to its valence. The inputs for the algorithm
are a vertex v and a triangle containing that vertex. The algorithm iteratively visits 
the neighbouring triangles, each time checking if the triangle has v as one of its 
vertices and has not been visited previously. In Fig. 8.9, the triangles indicated by 
dotted arrows are not visited as they do not have v as a vertex. The vertices of the 
visited triangles are added to the set of one-ring neighbours of v. A pseudo-code of 
this method is given in Listing 8.1. 
Listing 8.1 Pseudo code for the one-ring neighbourhood traversal algorithm

 1. Input: v, face          //The triangle face has v as a vertex
 2. S = {}                  //Solution set
 3. Add vertices of face other than v to S
 4. t_start = face          //Starting triangle
 5. t_previous = null
 6. t_current = a neighbour of face different from
    t_previous, which has v as a vertex
 7. if (t_current == t_start) STOP
 8. Add vertices of t_current other than v, and not
    already in S, to S
 9. t_previous = face
10. face = t_current
11. GOTO 6
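The traversal of Listing 8.1 can be sketched in C++ as follows. This is an illustration under stated assumptions rather than the book's implementation: the structure uses integer indices instead of the pointers of Fig. 8.8, the names buildTriangles, hasVertex and oneRing are hypothetical, and for a boundary vertex the walk proceeds in one direction only and stops where the open fan ends.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

struct Triangle {
    std::array<int, 3> p;   // vertex indices
    std::array<int, 3> t;   // t[k]: triangle across edge (p[k], p[(k+1)%3]), or -1
};

// Builds the face-based structure from a triangle list by matching shared edges.
std::vector<Triangle> buildTriangles(const std::vector<std::array<int, 3>>& faces)
{
    std::vector<Triangle> tris;
    for (const auto& f : faces) tris.push_back(Triangle{f, {{-1, -1, -1}}});

    std::map<std::pair<int, int>, std::pair<int, int>> open;   // edge -> (triangle, slot)
    for (int f = 0; f < static_cast<int>(tris.size()); ++f)
        for (int k = 0; k < 3; ++k) {
            const int a = tris[f].p[k], b = tris[f].p[(k + 1) % 3];
            const auto key = std::make_pair(std::min(a, b), std::max(a, b));
            const auto it = open.find(key);
            if (it == open.end()) { open[key] = std::make_pair(f, k); continue; }
            tris[f].t[k] = it->second.first;                    // link both triangles
            tris[it->second.first].t[it->second.second] = f;
        }
    return tris;
}

bool hasVertex(const Triangle& tri, int v)
{
    return tri.p[0] == v || tri.p[1] == v || tri.p[2] == v;
}

// One-ring neighbours of vertex v, starting from a triangle that contains v.
std::vector<int> oneRing(const std::vector<Triangle>& tris, int v, int face)
{
    assert(hasVertex(tris[face], v));
    std::vector<int> ring;                                      // solution set S
    auto addVertices = [&](int f) {
        for (int q : tris[f].p)
            if (q != v && std::find(ring.begin(), ring.end(), q) == ring.end())
                ring.push_back(q);
    };
    addVertices(face);
    const int start = face;
    int previous = -1;
    while (true) {
        int current = -1;
        for (int n : tris[face].t)
            if (n != -1 && n != previous && hasVertex(tris[n], v)) { current = n; break; }
        if (current == -1 || current == start) break;           // fan ended or closed
        addVertices(current);
        previous = face;
        face = current;
    }
    return ring;
}
```

For a tetrahedron with faces 0 1 2, 0 2 3, 0 3 1 and 1 3 2, calling oneRing for vertex 0 starting from the first face visits the three faces around that vertex and collects the vertices 1, 2 and 3.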


Table 8.1 Components of the winged-edge structure for the same edge in opposite directions

Edge  start  end  left  right  left_prev  left_next  right_prev  right_next
PQ    P      Q    L     R      a          b          c           d
QP    Q      P    R     L      c          d          a           b


8.3.2 Winged-Edge Data Structure 


The winged-edge data structure is a powerful representation of an
orientable mesh that can be used for a variety of edge-based queries and
manipulations of a mesh. In this representation, each face has a clockwise ordering
of its vertices and edges. The structure stores several interconnected pieces of
information pertaining to the neighbourhood of every edge in the form of three
substructures: an edge table, a vertex table and a face table.

An edge PQ and its adjacent faces are shown in Fig. 8.10. The direction of the 
edge is specified by the start and end vertices, and it enables us to define the left 
and right sides of the edge. The corresponding references to the polygon L on its 
left, and R on its right are stored. The edge structure also stores the preceding and 
succeeding edges of PQ with respect to each of these faces. The preceding edge on 
the left is the edge a, and the succeeding edge on the left is the edge b. Similarly, the 
preceding edge on the right is c, and the succeeding edge on the right d. Note that 
on each face, a clockwise ordering of the edges is used. Table 8.1 shows how the 
component values change when the direction of the same edge is reversed. 

The winged-edge structure also requires two additional tables or structures, as 
shown in Fig. 8.10. The vertex table stores the coordinates of each vertex and one of 
the edges incident to that vertex. The face table maps each face to one of the edges 
of that face. These tables provide the entry points to the edge structure via either a 
vertex or a face. For example, if we are required to find all edges that end at a given 
vertex v, we first use the vertex table to find one of the edges incident at v, and then 
use the winged-edge structure to iteratively find the remaining edges. Care must be 
struct W_edge
{
    Vertex *start, *end;
    Face *left, *right;
    W_edge *left_prev, *left_next;
    W_edge *right_prev, *right_next;
};

struct Vertex
{
    float x, y, z;
    W_edge *edge;
};

struct Face
{
    W_edge *edge;
};


Fig. 8.10 The winged-edge data structure 




Fig. 8.11 Computation of all edges incident at a vertex. Both directions of an edge should be 
considered in algorithms using the winged-edge data structure 


Listing 8.2 Pseudo code for finding all edges through a vertex in anticlockwise order

    Input: v                     //A vertex
    W_edge *e0 = v->edge;        //Initial edge
    W_edge *edge = e0;
    do
    {
        if(edge->end == v) edge = edge->right_next;
        else edge = edge->left_next;
        output(edge);
    } while (edge != e0);


taken to use the right orientation of an edge; the edge entry for a vertex v in the
vertex table may have v as either the start vertex or the end vertex. Similarly an
edge in the face table may have the face as either its left face or the right face of 
the edge. 

The algorithm to find all edges incident at a vertex v considers both the cases 
discussed above, and enumerates the edges surrounding v in an anticlockwise order 
(Fig. 8.11). The pseudo-code for the algorithm is given in Listing 8.2. 
Listing 8.3 Pseudo code for finding all faces that share a vertex in anticlockwise order

    Input: v                     //A vertex
    W_edge *e0 = v->edge;        //Initial edge
    W_edge *edge = e0;
    do
    {
        if(edge->end == v)
        { output(edge->right); edge = edge->right_next; }
        else
        { output(edge->left); edge = edge->left_next; }
    } while (edge != e0);


Fig. 8.12 Computation of edges around a polygonal face 


Listing 8.4 Pseudo code for finding all edges of a face in anticlockwise order

    Input: face
    W_edge *e0 = face->edge;     //Initial edge
    W_edge *edge = e0;
    do
    {
        if(edge->right == face) edge = edge->right_prev;
        else edge = edge->left_prev;
        output(edge);
    } while (edge != e0);


A slight modification of the above algorithm can yield a method to output all 
faces sharing a common vertex v in an anticlockwise order (Listing 8.3). 

The algorithm to compute all edges of a given polygonal face in anticlockwise 
order uses an approach similar to the ones given above. The iteration starts from the 
initial edge retrieved from the face table, and proceeds to the next edge based on the 
orientation of the current edge (Fig. 8.12). The pseudo-code for the algorithm is in 
Listing 8.4. 


8.3.3 Half-Edge Data Structure


The algorithm implementations discussed in the previous section show a limitation 
of the winged-edge data structure — the ambiguity regarding the direction of an edge 
struct H_edge
{
    Vertex *vert;
    Face *face;
    H_edge *prev, *next;
    H_edge *pair;
};

struct Vertex
{
    float x, y, z;
    H_edge *edge;
};

struct Face
{
    H_edge *edge;
};


Fig. 8.13 The half-edge data structure 


will need to be resolved every time an edge is processed, and this is commonly done 
using an if-else block to deal with the two possible directions of every edge. The 
half-edge data structure resolves the ambiguity by splitting every edge and storing it 
as two half-edges, each with a unique direction. A half-edge belongs to only a single 
face, which is the face on its left side. A half-edge structure stores references to the 
unique vertex the edge points to, the unique face it belongs to, the successor of the 
edge belonging to the same face, and the pair of the half-edge having the opposite 
direction and belonging to the adjacent face (Fig. 8.13). The half-edge structure is 
essentially a doubly linked list and hence is also known as the Doubly Connected 
Edge List (DCEL). 

The components of the half-edge PQ in Fig. 8.13 are the references to the 
ending vertex Q, the face L on its left side, the next edge b on the same face, 
and the pair which is the half-edge QP in the opposite direction. Edge processing 
algorithms often use references to the previous edge (e.g., the method shown in 
Fig. 8.14), and this information may also be stored in the edge structure. As in 
the case of the winged-edge structure, two additional tables/structures are used 
to obtain a half-edge from either a vertex or a face. The vertex table contains for 
each vertex, its coordinates and a half-edge incident at that vertex. The face table 
contains for each face, a half-edge that belongs to that face. From the definition 
of the half-edge structure, it is clear that for a given half-edge edge, the end and 
start points are given by edge->vert, and edge->pair->vert respectively. 
Similarly, the two faces that border an edge are given by edge->face and
edge->pair->face.

We will now consider the algorithm for computing all edges incident at a given 
vertex v. Using a half-edge data structure, the edges can be unambiguously retrieved 
using a simple iteration (Fig. 8.14). The modified version of the pseudo-code in 
Listing 8.2 using the half-edge data structure is given in Listing 8.5. In this case, the 
algorithm is simpler without any case distinction, and returns edges that end at the 
given vertex in anticlockwise order. 


Fig. 8.14 Computation of incident edges at a vertex in anticlockwise and clockwise orders using 
the half-edge structure 


Listing 8.5 Pseudo code for finding all edges that end at a vertex in anticlockwise order

    Input: v                     //A vertex
    H_edge *e0 = v->edge;        //Initial edge
    H_edge *edge = e0;
    do
    {
        edge = edge->pair->prev;
        output(edge);
    } while (edge != e0);


Listing 8.6 Pseudo code for finding all faces adjacent to a face

    Input: face
    H_edge *e0 = face->edge;     //Initial edge
    H_edge *edge = e0;
    do
    {
        edge = edge->next;
        output(edge->pair->face);
    } while (edge != e0);


In Listing 8.5, if we replace the output statement with output 
(edge->pair->vert), we get all the vertices in the one-ring neighbourhood 
of the given vertex. Likewise, the method with output (edge->face) returns 
all faces that share the vertex. The half-edge data structure provides a convenient 
tool for enumerating all faces that are adjacent to a given face (Listing 8.6). 
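
The traversal of Listing 8.5 can be tried out on a small closed mesh. The following C++ sketch is illustrative only: the `Mesh` container, the `buildClosedMesh` helper and the `oneRing` function are hypothetical names that do not appear in the text, vertex coordinates are omitted, and the builder assumes a closed, consistently oriented triangle mesh in which `v->edge` is a half-edge ending at `v`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct HEdge;
struct Vertex { HEdge* edge = nullptr; };  // coordinates omitted for brevity
struct Face   { HEdge* edge = nullptr; };
struct HEdge {
    Vertex* vert = nullptr;  // vertex at which the half-edge ends
    HEdge*  pair = nullptr;  // oppositely directed half-edge
    HEdge*  next = nullptr;  // next half-edge on the same face
    HEdge*  prev = nullptr;  // previous half-edge on the same face
    Face*   face = nullptr;  // face on the left side
};

// Hypothetical container. The vectors must not be resized after building,
// because the records hold pointers into them.
struct Mesh {
    std::vector<Vertex> verts;
    std::vector<Face>   faces;
    std::vector<HEdge>  edges;
};

// Builds the half-edge records of a closed, consistently oriented triangle mesh.
Mesh buildClosedMesh(int numVerts, const std::vector<std::array<int, 3>>& tris) {
    Mesh m;
    m.verts.resize(numVerts);
    m.faces.resize(tris.size());
    m.edges.resize(3 * tris.size());
    std::map<std::pair<int, int>, HEdge*> lookup;  // (start, end) -> half-edge
    for (std::size_t f = 0; f < tris.size(); ++f) {
        for (int k = 0; k < 3; ++k) {
            const int from = tris[f][k];
            const int to = tris[f][(k + 1) % 3];
            HEdge* e = &m.edges[3 * f + k];
            e->vert = &m.verts[to];
            e->face = &m.faces[f];
            e->next = &m.edges[3 * f + (k + 1) % 3];
            e->prev = &m.edges[3 * f + (k + 2) % 3];
            m.verts[to].edge = e;  // any half-edge ending at the vertex will do
            lookup[std::make_pair(from, to)] = e;
        }
        m.faces[f].edge = &m.edges[3 * f];
    }
    for (auto& entry : lookup)  // at() throws std::out_of_range on an open mesh
        entry.second->pair =
            lookup.at(std::make_pair(entry.first.second, entry.first.first));
    return m;
}

// One-ring neighbours of v: Listing 8.5 with output(edge->pair->vert).
std::vector<const Vertex*> oneRing(const Vertex& v) {
    std::vector<const Vertex*> ring;
    HEdge* e0 = v.edge;
    assert(e0 != nullptr && e0->vert == &v);
    HEdge* edge = e0;
    do {
        edge = edge->pair->prev;
        ring.push_back(edge->pair->vert);
    } while (edge != e0);
    return ring;
}
```

For a tetrahedron, every vertex has the other three vertices as its one-ring neighbours. Replacing the `push_back` argument with `edge->face` would collect the faces sharing the vertex instead.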

An edge data structure links together adjacency information pertaining to 
vertices, faces and neighbouring edges. The removal of an edge from a polygon 
calls for an update of this information by way of readjusting references to the deleted 
edge and adjacent faces. As an example, if the edge PQ of the polygonal face shown 
in Fig. 8.15 is removed, several edges along the boundary of the resulting polygon 
will need to be updated. Listing 8.7 provides the list of “tidy-up” operations required 
after removing PQ. Even though the edges marked ‘a’, ‘b’, ‘c’, ‘d’ in Fig. 8.15 can 
be indirectly referenced through either e1 or e2, separate variable declarations for 
each of these edges are used in Listing 8.7 for better clarity. 

Fig. 8.15 Readjustments to pointers/references are required when an edge is removed 


Listing 8.7 Pseudo code for the tidy-up operations required after removing an edge 

Input: e1                   //The edge and its pair to be removed 
e2 = e1->pair; 
f1 = e1->face; 
f2 = e2->face; 
a = e1->prev; 
b = e1->next; 
c = e2->next; 
d = e2->prev; 
q = e1->vert; 
p = e2->vert; 
a->next = c; 
c->prev = a; 
b->prev = d; 
d->next = b; 
p->edge = a; 
q->edge = d; 
if(f1->edge == e1) f1->edge = a; 
if(p->edge == e2) p->edge = a; 
if(q->edge == e1) q->edge = d; 
edge = a; 
while (edge != d) 
{ 
    edge = edge->next; 
    edge->face = f1; 
} 

We now consider the inverse of the process discussed above, where a new edge 
is introduced into a polygon, splitting the polygon into two separate polygons. This 
process is commonly used for incrementally triangulating an arbitrary polygon. 
With reference to Fig. 8.15, the sequence of operations required for adding a new 
edge PQ is given in Listing 8.8. 

In the following sections, we consider more complex mesh processing algorithms 
that use different types of adjacency information. 

Listing 8.8 Procedure for adding a new edge PQ to a polygon 

Input: p, q, f1             //Two non-adjacent vertices of the face f1 
Enumerate edges ending at p, and find the edge 'a' that has f1 as its face. 
Enumerate edges ending at q, and find the edge 'd' that has f1 as its face. 
c = a->next 
b = d->next 
Create 2 new half-edges e1, e2 
Create a new face f2 
e1->vert = q;   e2->vert = p; 
e1->prev = a;   e2->prev = d; 
e1->next = b;   e2->next = c; 
e1->face = f1;  e2->face = f2; 
e1->pair = e2;  e2->pair = e1; 
f1->edge = e1;              //f1 may have referenced an edge that now belongs to f2 
f2->edge = d; 
a->next = e1;   d->next = e2; 
b->prev = e1;   c->prev = e2; 
edge = c; 
do 
{ 
    edge->face = f2; 
    edge = edge->next; 
} while (edge != e2); 


8.4 Mesh Simplification 


Mesh simplification algorithms aim to reduce the geometric complexity of a mesh 
without altering the essential shape characteristics. These methods are designed to 
take meshes containing a large number of polygons and convert them into meshes 
with a relatively smaller number of polygons. Mesh simplification is commonly 
used in the construction of level-of-detail representations of objects with a high 
polygon count. Most of the algorithms try to preserve the topology of the mesh by 
making sure that the resulting mesh has the same Euler characteristic (Eq. 8.1). In 
this section, we outline two important methods based on the local simplification 
strategy that progressively remove vertices or edges until the required level of 
simplification is achieved. In general, simplification methods will use a cost function 
to select the most appropriate vertex or edge for removal, and also have a set of 
constraints which the selected item is required to satisfy. 


8.4.1 Vertex Decimation 


The vertex decimation algorithm iteratively removes vertices from a triangular 
mesh, at the same time trying to preserve the topology and the shape of the 
original mesh. When a vertex is removed, its one-ring neighbourhood will need 
to be re-triangulated. The selection of a vertex for removal is generally based on a 
decimation criterion that ensures that important shape features of the mesh are not 
affected. 

Fig. 8.16 Definition of the average plane of a set of triangles sharing a vertex 


One of the commonly used criteria for vertex decimation is the near-planarity 
of the neighbourhood of a vertex. A nearly planar region could be represented 
by a few large triangular elements covering the region instead of several small 
triangles. Consider an interior vertex v surrounded by a closed triangle fan as shown 
in Fig. 8.16. The planarity of the surface region around the vertex can be measured as 
the distance D of the vertex from an average plane of its neighbourhood. The average 
plane is formed by the local area-weighted average of the surface normal vectors 
$n_i$ and the centroids $p_i$ of all triangles sharing the vertex v. If there are k triangles 
that have a common vertex v, and if $A_i$, $n_i$, $p_i$ denote the area, normal vector and the 
centroid respectively of the i-th triangle, then the area-weighted average normal and 
point are computed as follows: 


$$ n_{avg} = \frac{\sum_{i=1}^{k} A_i n_i}{\sum_{i=1}^{k} A_i} \tag{8.3} $$

$$ p_{avg} = \frac{\sum_{i=1}^{k} A_i p_i}{\sum_{i=1}^{k} A_i} \tag{8.4} $$

The average plane is then defined as the plane passing through the point $p_{avg}$, 
having a normal direction $n_{avg}$. Its equation can be obtained as given in Eq. 2.21, 
and the shortest distance D of the vertex v to the plane can be computed using Eq. 
2.24. The value of D can be used as the cost function for selecting a vertex. 
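
The computation of D for an interior vertex can be sketched in C++ as follows. This is a minimal illustration, not the book's implementation: `Vec3` and `distanceToAveragePlane` are hypothetical names, the one-ring neighbours are assumed to be supplied in anticlockwise order around a closed fan, and the average normal is normalised before the point-plane distance is taken because the area-weighted average is in general not a unit vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Distance D of an interior vertex v from the average plane of its closed
// triangle fan; `ring` holds the one-ring neighbours in anticlockwise order.
double distanceToAveragePlane(Vec3 v, const std::vector<Vec3>& ring) {
    assert(ring.size() >= 3);
    Vec3 nSum{0, 0, 0}, pSum{0, 0, 0};
    double areaSum = 0.0;
    for (std::size_t i = 0; i < ring.size(); ++i) {
        const Vec3 a = ring[i];
        const Vec3 b = ring[(i + 1) % ring.size()];
        const Vec3 c = cross(a - v, b - v);       // |c| = 2 A_i, c / |c| = n_i
        const double area = 0.5 * std::sqrt(dot(c, c));
        nSum = nSum + 0.5 * c;                    // A_i n_i
        pSum = pSum + (area / 3.0) * (v + a + b); // A_i p_i (p_i is the centroid)
        areaSum += area;
    }
    assert(areaSum > 0.0);
    const Vec3 nAvg = (1.0 / areaSum) * nSum;     // Eq. 8.3
    const Vec3 pAvg = (1.0 / areaSum) * pSum;     // Eq. 8.4
    const double len = std::sqrt(dot(nAvg, nAvg));
    assert(len > 0.0);
    return std::fabs(dot(nAvg, v - pAvg)) / len;  // point-to-plane distance
}
```

A vertex lying in the plane of its neighbours gives D = 0, so it would be among the first candidates for removal.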

If v is a boundary vertex, the deviation of the boundary segments containing v 
from a straight line can be used as the error function (Fig. 8.17). This is measured 
as the shortest distance D of the vertex from an imaginary line connecting the two 
opposite neighbours of the vertex along the boundary. These neighbouring vertices 
are connected to v by edges that have only one bordering face, and can be identified 
using either a winged-edge or half-edge based ring traversal algorithm. The shortest 
distance D can then be obtained using Eq. 2.16. D is the altitude of the triangle 
formed by v and its two neighbours (Fig. 8.17). 
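
For a boundary vertex, the altitude of the triangle follows directly from a cross product: twice the triangle area divided by the length of the base. The sketch below assumes the two boundary neighbours have already been found by a ring traversal; `Vec3` and `boundaryError` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Distance D of a boundary vertex v from the line joining its two
// neighbours along the boundary (the altitude of the triangle they form).
double boundaryError(Vec3 v, Vec3 prev, Vec3 next) {
    const Vec3 base = next - prev;
    const double baseLen = std::sqrt(dot(base, base));
    assert(baseLen > 0.0);                 // the neighbours must be distinct
    const Vec3 c = cross(v - prev, base);  // |c| = 2 * triangle area
    return std::sqrt(dot(c, c)) / baseLen; // altitude = 2 * area / base
}
```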

Fig. 8.17 The error metric for a boundary vertex can be defined as the distance of the vertex from 
the dotted line connecting its neighbours on the boundary 


The vertex decimation algorithm uses a greedy approach, iteratively selecting 
the vertex with the current minimum value of the error metric D for decimation. 
An upper threshold for D prevents all vertices with values greater than the threshold 
from being deleted. When a vertex is removed, all edges that end at the vertex are 
also removed and the components of the edges of the resulting polygon are adjusted 
as previously shown in Fig. 8.15 and Listing 8.7. Two important steps remain in 
the vertex decimation process before proceeding to the next iteration where another 
vertex is chosen: the polygon resulting from the removal of the current vertex must 
be triangulated (Fig. 8.18), and the error metrics for its vertices must be updated. 
The one-ring neighbourhood of the deleted vertex will in general form the boundary 
of a star-shaped polygon. Algorithms for the triangulation of such polygons are 
discussed later in this chapter. Convex polygons are special types of star-shaped 
polygons where every internal angle is at most 180°. Convex polygons can be 
easily triangulated from any vertex, but such a triangulation may not always give 
the optimal value for the minimum angle of the triangles. 


8.4.2 Edge Collapse Operation 


An edge collapse is a simpler operation than vertex decimation. 
Here, a local curvature based cost function is associated with every edge, and used 
for selecting an edge for removal. The edge and its two incident faces are removed 
by moving one of the edge's end points towards the other, and deleting the second 
vertex. The result of an edge collapse operation is illustrated in Fig. 8.19, where the 
edge PQ is collapsed and the vertex Q deleted. Note that the new position of P may 
in general be somewhere in between the original positions of P and Q. Commonly 
adopted methods for positioning P are: (a) keep P in its original position, (b) move 
P to coincide with Q, or (c) move P to the midpoint of PQ. 

An edge collapse operation can be implemented using an edge-based data 
structure such as the winged-edge or the half-edge. We use the half-edge structure, 
as it helps in minimizing the restructuring needed in the surrounding mesh elements. 
All references to the vertex P are retained, while those to Q are replaced with P. 
The sequence of steps required by the edge collapse operation is given in Listing 8.9. 
The references to edges and faces used in the code are shown in Fig. 8.20. The new 
position of P is indicated by the point P'. 

Fig. 8.18 Removal of internal and boundary vertices and the triangulation of the resulting polygons 

Fig. 8.19 An edge collapse operation performed by moving the vertex P towards Q 

An interesting aspect of the edge collapse operation is that it is totally invertible. 
With reference to Fig. 8.20b, given the original positions of P, Q and also the 
locations of R, S, we can reconstruct the edge PQ and its two adjacent triangles 
as in Fig. 8.20a. The inverse process is called the vertex split operation. 

The main topological restrictions used by the edge collapse algorithm in selecting 
edges are shown in Fig. 8.21. The bottom row of the figure shows the result of the 
edge collapse operation in each of the following cases: 


(a) The edge belongs to a triangle whose other two edges are boundary edges. 
Collapsing this edge results in a topologically inconsistent configuration that 
contains an isolated vertex. 

Fig. 8.20 Edge references used by the edge collapse algorithm in Listing 8.9 


Listing 8.9 Procedure for collapsing the edge PQ in 
Figs. 8.19 and 8.20. 

a->prev = b2->prev; 
a->next = b2->next; 
a->face = b2->face; 
c->prev = d2->prev; 
c->next = d2->next; 
c->face = d2->face; 
if (p->edge == e2) p->edge = a; 
edge = d1; 
do 
{ 
    edge = edge->pair->prev; 
    edge->vert = p; 
} while (edge != b2); 
b2->prev->next = a; 
d2->prev->next = c; 
d2->next->prev = c; 
b2->next->prev = a; 
b2->face->edge = a; 
d2->face->edge = c; 
b2->prev->vert->edge = b2->prev; 
c->vert->edge = c; 
Update the position of p as required 

(b) Both vertices of the edge are boundary vertices, but the edge is not a boundary 
edge. Collapsing this edge results in a non-manifold vertex. 

(c) The intersection of the one-ring neighbourhoods of vertices P and Q normally 
contains only the opposite vertices A, B of the edge PQ. In the special case 
shown in the figure, the intersection contains vertices A, B, C, D. Collapsing the 
edge PQ removes six faces instead of just two. In a more general case, when 
AC or BD is not perpendicular to PQ, the operation results in the folding of 
triangles. 

Fig. 8.21 Configurations not suitable for the edge collapse operation 


As with the vertex decimation algorithm, we require a cost function for assigning 
a priority value to edges for removal. The cost function is often designed as a measure 
of the local curvature and represents the geometric error introduced by the edge 
collapse operation. A simple cost function is a linear combination of the dihedral 
angle between the two triangles bordering the edge and the length of the edge: 

$$ \mathrm{Cost}(P, Q) = k_1 \cos^{-1}(m_1 \cdot m_2) + k_2 |P - Q| \tag{8.5} $$

where $m_1$, $m_2$ are the unit normal vectors of the two triangles and $k_1$, $k_2$ are user 
specified constants. The computation of inverse cosine in the above equation can be 
eliminated by replacing the function with a mapping of the value of $m_1 \cdot m_2$ from 
the range [-1, +1] to [+1, 0]: 

$$ \mathrm{Cost}(P, Q) = k_1 \left( \frac{1 - m_1 \cdot m_2}{2} \right) + k_2 |P - Q| \tag{8.6} $$


The cost function proposed by Melax (1998) uses the product of the edge length 
and the local curvature. The local curvature here is defined as the largest dihedral 
angle between the triangles incident at P and the face of the edge PQ that is on the 
same side as the triangle. The mapping in the above equation is again used as the 
approximation of the dihedral angle. If the unit surface normal vectors of the faces 
incident at P are denoted by $n_i$, i = 1..N, and $m_1$, $m_2$ denote the unit normal vectors 
of the two triangles adjacent to the edge PQ, then 

$$ \mathrm{Cost}(P, Q) = |P - Q| \cdot \max_{i = 1..N} \left\{ \min_{j = 1, 2} \left( \frac{1 - n_i \cdot m_j}{2} \right) \right\} \tag{8.7} $$
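
The two cost functions of Eqs. 8.6 and 8.7 translate almost directly into code. The C++ sketch below is illustrative only; `Vec3`, `angleTerm`, `linearCost` and `melaxCost` are hypothetical names, and all normal vectors are assumed to be of unit length.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
double edgeLength(Vec3 p, Vec3 q) {
    const Vec3 d{p.x - q.x, p.y - q.y, p.z - q.z};
    return std::sqrt(dot(d, d));
}

// Maps the dot product of two unit normals from [-1, +1] to [+1, 0].
double angleTerm(Vec3 n, Vec3 m) { return 0.5 * (1.0 - dot(n, m)); }

// Eq. 8.6: weighted sum of the angle term and the edge length.
double linearCost(Vec3 p, Vec3 q, Vec3 m1, Vec3 m2, double k1, double k2) {
    return k1 * angleTerm(m1, m2) + k2 * edgeLength(p, q);
}

// Eq. 8.7: edge length times the largest (over the faces at P) of the
// smaller angle term with respect to the two faces bordering PQ.
double melaxCost(Vec3 p, Vec3 q, const std::vector<Vec3>& normalsAtP,
                 Vec3 m1, Vec3 m2) {
    double curvature = 0.0;
    for (const Vec3& n : normalsAtP)
        curvature = std::max(curvature,
                             std::min(angleTerm(n, m1), angleTerm(n, m2)));
    return edgeLength(p, q) * curvature;
}
```

In a flat region all normals coincide, both angle terms vanish, and the Melax cost is zero regardless of the edge length, so edges in planar areas are collapsed first.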


Another cost function that has been found particularly useful for edge collapse 
operations is the quadric error metric (QEM). The primary advantage of this method 
is that the error function parameters can be pre-computed for the original vertices of 
the mesh, and later used to obtain the cost associated with any edge PQ. This cost 
function is simply the sum of the error metrics at the end points P, Q evaluated using 
the new position of the vertex P. The following paragraphs describe the computation 
of this cost function. 

Fig. 8.22 The quadric error metric is defined using the equations of planes incident at each vertex 

Consider an edge PQ as in Fig. 8.22, and assume that this edge is collapsed and 
P moved to a new position P'. The value of the cost function at P' is the sum of 
squares of the distances of P' to the planes adjacent to both P and Q (Fig. 8.22b). If 
$n_i$ denotes the unit surface normal vector of a triangle incident at the point $P(x_p, y_p, z_p)$, 
then the equation of the plane of that triangle can be written as (see Eq. 2.22) 

$$ a_i x + b_i y + c_i z + d_i = 0, \tag{8.8} $$

where $(a_i, b_i, c_i) = n_i$, and $d_i = -a_i x_p - b_i y_p - c_i z_p$. The shortest distance of a point 
$V(x_v, y_v, z_v)$ to this plane is given by (see Eq. 2.25) 

$$ D_i(V) = a_i x_v + b_i y_v + c_i z_v + d_i = A_i^T V \tag{8.9} $$

where 

$$ A_i = \begin{bmatrix} a_i \\ b_i \\ c_i \\ d_i \end{bmatrix}, \quad V = \begin{bmatrix} x_v \\ y_v \\ z_v \\ 1 \end{bmatrix} $$

The square of the distance of V to the plane is therefore 

$$ D_i^2(V) = (A_i^T V)^T (A_i^T V) = V^T (A_i A_i^T) V \tag{8.10} $$

Thus the sum of squares of distances of V to all planes adjacent to P is given by 

$$ D_P^2(V) = V^T \left( \sum_{i \in N_p} A_i A_i^T \right) V \tag{8.11} $$

In the above equation $N_p$ denotes the set of all triangles incident at P. The right-hand 
side of the equation is a quadratic polynomial, hence the name Quadric Error 
Metric (Garland 1999). The summation within the brackets can be pre-computed for 
every vertex P and stored. We are now in a position to define the cost function using 
QEM. If the edge PQ is removed, and if P' is the new position of P, then 


$$ \mathrm{Cost}(P, Q) = D_P^2(P') + D_Q^2(P') \tag{8.12} $$
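
The pre-computation described above amounts to accumulating one 4 x 4 matrix per vertex. The following C++ sketch shows one possible arrangement; the `Quadric` type and the `collapseCost` function are hypothetical names, the matrix is stored in full rather than exploiting its symmetry, and each plane is assumed to be given with a unit normal (a, b, c).

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Sum of the outer products A_i A_i^T over the planes at a vertex (Eq. 8.11).
struct Quadric {
    std::array<double, 16> m{};  // row-major 4 x 4 matrix, initially zero

    // Adds A A^T for the plane ax + by + cz + d = 0, with (a, b, c) a unit vector.
    void addPlane(double a, double b, double c, double d) {
        const double A[4] = {a, b, c, d};
        for (int r = 0; r < 4; ++r)
            for (int s = 0; s < 4; ++s) m[4 * r + s] += A[r] * A[s];
    }

    // Evaluates V^T M V for V = (x, y, z, 1): the sum of squared distances.
    double error(double x, double y, double z) const {
        const double V[4] = {x, y, z, 1.0};
        double sum = 0.0;
        for (int r = 0; r < 4; ++r)
            for (int s = 0; s < 4; ++s) sum += V[r] * m[4 * r + s] * V[s];
        return sum;
    }
};

// Eq. 8.12: cost of collapsing PQ when P moves to the position (x, y, z).
double collapseCost(const Quadric& atP, const Quadric& atQ,
                    double x, double y, double z) {
    return atP.error(x, y, z) + atQ.error(x, y, z);
}
```

Because the quadrics are additive, the two matrices may equally be summed first and the combined matrix evaluated once at P'.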


In the next section we consider the inverse problem of mesh simplification, and 
look at a few important subdivision algorithms. 


8.5 Mesh Subdivision 


Mesh subdivision methods increase the polygon density of a mesh by iteratively 
splitting polygons and applying a set of rules for repositioning the vertices. Every 
subdivision step increases the number of edges, vertices and polygons in a mesh 
without grossly distorting the overall shape or topological characteristics. Mesh 
subdivision algorithms are used for geometric modelling of complex surfaces from 
simple coarse meshes through successive refinement, smoothing and approximation. 
Subdivision algorithms provide us the capability to alter the level of detail of a 
polygonal mesh from very coarse to highly tessellated and smooth object models. 
Such methods are therefore also called scalable geometry techniques. 

Before considering subdivision algorithms for polygonal meshes, we review the 
fundamental aspects of iterative polygonal line subdivision. 


8.5.1 Subdivision Curves 


An iterative refinement of a control polygon can be made to converge to a parametric 
curve by suitably defining the transformations associated with points at each level. 
Consider a polygonal line formed using four control points as shown in Fig. 8.23. 
We denote this set as $S^0 = \{P_0^0, P_1^0, P_2^0, P_3^0\}$. The superscript indicates the 
subdivision level, and the subscript the index of the point within the set. At the 
next level, this set is refined into $S^1 = \{P_0^1, P_1^1, \ldots, P_6^1\}$ by adding a new point 
in between every consecutive pair of points, and also transforming the existing 
points. Points $P_{2i}^1 \in S^1$ with an even index correspond to existing points $P_i^0 \in S^0$ 
at the previous level, while points with an odd index in $S^1$ are newly inserted 
points. Figure 8.23 also shows the next level of subdivision $S^2$. The points at a level 
j + 1 are generated from the points belonging to the previous level according to the 
following equations, also known as refinement rules: 

Transformation of existing points: 

$$ P_{2i}^{j+1} = \frac{1}{8} P_{i-1}^{j} + \frac{6}{8} P_{i}^{j} + \frac{1}{8} P_{i+1}^{j}, \quad i = 1, \ldots, N_j - 2. \tag{8.13} $$

Insertion of new points: 

$$ P_{2i+1}^{j+1} = \frac{1}{2} P_{i}^{j} + \frac{1}{2} P_{i+1}^{j}, \quad i = 0, \ldots, N_j - 2, \tag{8.14} $$

where $N_j$ is the number of points in $S^j$. The number of points in $S^{j+1}$ is then $2N_j - 1$. 
For the example in Fig. 8.23, $N_0 = 4$, $N_1 = 7$, and $N_2 = 13$. The end points of the 
polygonal line are kept fixed throughout the subdivision process: 

$$ P_0^{j+1} = P_0^{j}, \quad P_{2(N_j - 1)}^{j+1} = P_{N_j - 1}^{j} \tag{8.15} $$

As can be seen from Fig. 8.23, when the level number increases, the set of points 
converges to a continuous parametric curve. 
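
One refinement step of the rules in Eqs. 8.13 to 8.15 can be sketched in C++ as follows. The `Point2` type and the `refine` function are hypothetical names introduced for illustration, and the example works in two dimensions only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// Produces the 2N - 1 points of level j + 1 from the N points of level j.
std::vector<Point2> refine(const std::vector<Point2>& p) {
    const std::size_t n = p.size();
    assert(n >= 2);
    std::vector<Point2> out(2 * n - 1);
    out.front() = p.front();                    // Eq. 8.15: end points stay fixed
    out.back() = p.back();
    for (std::size_t i = 1; i + 1 < n; ++i) {   // Eq. 8.13: existing points
        out[2 * i] = {(p[i - 1].x + 6.0 * p[i].x + p[i + 1].x) / 8.0,
                      (p[i - 1].y + 6.0 * p[i].y + p[i + 1].y) / 8.0};
    }
    for (std::size_t i = 0; i + 1 < n; ++i) {   // Eq. 8.14: inserted points
        out[2 * i + 1] = {0.5 * (p[i].x + p[i + 1].x),
                          0.5 * (p[i].y + p[i + 1].y)};
    }
    return out;
}
```

Calling `refine` twice on a control polygon of four points gives 7 and then 13 points, matching the counts quoted for Fig. 8.23.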

Figure 8.24 shows how three consecutive points at level j + 1 are computed using 
three points at level j. The dotted lines correspond to the transformation in Eq. 8.13, 
and the solid lines to Eq. 8.14. This correspondence between three points can be 
expressed as the following equation: 


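$$ \begin{bmatrix} P_{2i-1}^{j+1} \\ P_{2i}^{j+1} \\ P_{2i+1}^{j+1} \end{bmatrix} = \frac{1}{8} \begin{bmatrix} 4 & 4 & 0 \\ 1 & 6 & 1 \\ 0 & 4 & 4 \end{bmatrix} \begin{bmatrix} P_{i-1}^{j} \\ P_{i}^{j} \\ P_{i+1}^{j} \end{bmatrix} \tag{8.16} $$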


The above transformation matrix can be extended to a 5 x 5 matrix for five consecutive 
points. In the above example, each of the existing points was transformed 
using a convex combination of three points. The transformations can be further 
generalized using convex combinations of k points: 

$$ P_{2i}^{j+1} = \sum_{u=-k}^{k} a_u P_{i+u}^{j}, \qquad P_{2i+1}^{j+1} = \sum_{u=-k}^{k} b_u P_{i+u}^{j} \tag{8.17} $$

Fig. 8.23 A control polygonal line and the next two levels of its subdivision ($S^0$, $S^1$, $S^2$) 

Fig. 8.24 Correspondence between three consecutive points at levels j and j + 1 


The coefficient sets $\{a_u\}$, $\{b_u\}$ are called subdivision masks. A subdivision mask 
is said to be stationary if its values do not vary with the subdivision level j. Each set 
also forms a partition of unity: 

$$ \sum_{u} a_u = 1, \qquad \sum_{u} b_u = 1. \tag{8.18} $$


In the next section, we extend the concepts outlined above to subdivision 
surfaces. 


8.5.2 The Loop Subdivision Algorithm 


The subdivision of a triangular polygonal manifold can be performed in a manner 
similar to the method given in the previous section, by adding a new vertex at 
the midpoint of each edge, and transforming the existing vertices. The triangular 
subdivision scheme without the coordinate transformation is shown in Fig. 8.25. 
As the subdivision level increases, the mesh rapidly tends to become a regular 
mesh where the valence of every internal vertex is 6. Internal vertices where the 
valence is not equal to 6 are called extraordinary vertices. 

Fig. 8.25 Triangular subdivision scheme 

Fig. 8.26 Masks used by the Loop subdivision algorithm 

The coordinates of the subdivided mesh vertices are computed using two 
subdivision masks as seen in the previous section. The first mask computes the 
coordinates of a new vertex based on a convex combination of existing neighbouring 
vertices. The second mask transforms every existing vertex using another convex 
combination of its one-ring neighbours. The Loop subdivision scheme is primarily 
designed for triangular meshes and it uses the subdivision masks shown in Fig. 8.26. 
Correspondingly, the update equations for points at subdivision level j + 1 can be 
written as follows: 

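Insertion of new vertices: 

$$ P_{new}^{j+1} = \frac{1}{8} P_A^{j} + \frac{3}{8} P_B^{j} + \frac{3}{8} P_C^{j} + \frac{1}{8} P_D^{j} \tag{8.19} $$

where the weights 3/8 apply to the two end points B, C of the edge being split, and the 
weights 1/8 to the two vertices A, D opposite that edge in its adjacent triangles. 

Transformation of existing vertices: 

$$ P_G^{j+1} = (1 - n\lambda) P_G^{j} + \lambda \sum_{i=1}^{n} P_{G_i}^{j} \tag{8.20} $$

where $G_i$, i = 1,..n are the one-ring neighbours (at level j) of an existing vertex G. 
The factor $\lambda$ is chosen such that 

$$ \lambda < \frac{1}{2n} \tag{8.21} $$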


Fig. 8.27 Loop subdivision masks for boundary vertices (new boundary vertex: weights 1/2, 1/2; 
existing boundary vertex: weights 1/8, 6/8, 1/8) 

Fig. 8.28 Subdivision of a tetrahedron using the Loop algorithm 


The above condition ensures that the weight $(1 - n\lambda)$ assigned to the current 
vertex is greater than the sum of weights $n\lambda$ assigned to its one-ring neighbours. 
For a regular vertex (n = 6), $\lambda$ is given a value 1/16. Equation 8.20 then becomes 

$$ P_G^{j+1} = \frac{10}{16} P_G^{j} + \frac{1}{16} \sum_{i=1}^{6} P_{G_i}^{j} \tag{8.22} $$


For boundary vertices, the subdivision masks in Fig. 8.26 are appropriately 
modified as shown in Fig. 8.27. 

The Loop subdivision algorithm can be implemented using a model definition 
based on vertex and face lists, and a data structure such as the half-edge for 
obtaining the one-ring neighbours of vertices. The result of the application of the 
Loop subdivision algorithm on a tetrahedral object is shown in Fig. 8.28. Note that 
the original vertices of the tetrahedron are the only extraordinary vertices of the 
mesh as they have a constant valence 3 throughout the subdivision process. All new 
vertices are regular vertices with valence 6. 


8.5.3 Catmull-Clark Subdivision 


The Catmull-Clark subdivision scheme can be applied to meshes with arbitrary 
topology, but unlike the previous method, it produces a mesh that consists primarily 
of quadrilaterals containing vertices of valence 4. 

Fig. 8.29 In each iteration of the Catmull-Clark algorithm, new face and edge points are added 
and existing vertex positions are updated 

In each subdivision step of the algorithm, the following mesh operations are 
performed in sequence: 


* A new face point is added to each face by computing the average of all vertices 
  of the face (Fig. 8.29a). In the following equation, j denotes the subdivision level, 
  f a face, $n_f$ the number of vertices of that face, and $v_i$ the vertices of the face. 
  $v_f$ denotes the new face point. 

$$ v_f^{j+1} = \frac{1}{n_f} \sum_{i=1}^{n_f} v_i^{j} \tag{8.23} $$

* A new edge point is added to each edge by computing the average of the end 
  points of the edge and the new face points of the edge's neighbouring faces 
  (Fig. 8.29b). In the following equation, the new edge point is denoted by $v_e$. 
  The edge has end points $v_A$, $v_B$, and adjacent faces f and g. 

$$ v_e^{j+1} = \frac{v_f^{j+1} + v_g^{j+1} + v_A^{j} + v_B^{j}}{4} \tag{8.24} $$

* After adding new face and edge points, the position of every old vertex is updated 
  as follows. Let $n_v$ be the number of edges incident at a vertex v, $v_i$ the one-ring 
  neighbours of the vertex v, and $v_{f_i}$ the new face points on the faces surrounding 
  v (Fig. 8.29c). $Q_v$ and $R_v$ denote the average of the new face points and the edge 
  midpoints respectively. The superscript j denotes the subdivision level. 

$$ Q_v = \frac{1}{n_v} \sum_{i=1}^{n_v} v_{f_i}^{j+1}, \quad R_v = \frac{1}{n_v} \sum_{i=1}^{n_v} \frac{v^{j} + v_i^{j}}{2}, \quad v^{j+1} = \frac{1}{n_v} Q_v + \frac{2}{n_v} R_v + \frac{n_v - 3}{n_v} v^{j} \tag{8.25} $$


The vertex update equation in Eq. 8.25 can be viewed as a convex combination 
of three points $Q_v$, $R_v$ and $v^j$, with weights 0.25, 0.5, 0.25 for a regular vertex. For a 
vertex of valence 3, the weights are 0.33, 0.67 and 0. 
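
The vertex update can be sketched in C++ as below. The sketch assumes the standard Catmull-Clark rule, in which a vertex of valence n receives the weights 1/n, 2/n and (n - 3)/n for the face-point average, the edge-midpoint average and the old position respectively; this reproduces the weights quoted above for valences 4 and 3. `Vec3` and `updateVertex` are hypothetical names, and the new face points are assumed to have been computed beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }

// New position of an old vertex v, given its one-ring neighbours and the
// new face points of the faces surrounding it (one per incident edge).
Vec3 updateVertex(Vec3 v, const std::vector<Vec3>& neighbours,
                  const std::vector<Vec3>& facePoints) {
    const std::size_t n = neighbours.size();
    assert(n >= 3 && facePoints.size() == n);
    Vec3 q{0, 0, 0}, r{0, 0, 0};
    for (std::size_t i = 0; i < n; ++i) {
        q = q + facePoints[i];                // sum of new face points
        r = r + 0.5 * (v + neighbours[i]);    // sum of edge midpoints
    }
    const double valence = static_cast<double>(n);
    const double inv = 1.0 / valence;
    q = inv * q;                              // average face point Q
    r = inv * r;                              // average edge midpoint R
    return inv * (q + 2.0 * r + (valence - 3.0) * v);
}
```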

On completion of the steps outlined above, the mesh is re-tessellated. New faces 
and edges are added to the mesh by connecting each new face point to every new 
edge point located around that face (Fig. 8.30a). Insertion of new edge points also 
splits existing edges. Coordinates of existing vertices as well as the definitions of 
edges incident at those vertices are updated (Fig. 8.30b). 

As seen in Fig. 8.30b, the newly added faces are all quadrilaterals, and all new 
edge points will have valence 4. Vertices that have a valence other than 4 after the 
first iteration will continue to have a valence other than 4 in subsequent iterations, 
and will therefore become extraordinary vertices. The Catmull-Clark subdivision of 
a cube is shown in Fig. 8.31. The original vertices of the cube always have a valence 
3, while all other vertices have a valence 4. 


8.5.4 Root-3 Subdivision 


The √3-subdivision scheme combines a triangle split operation and an edge flip 
operation to generate a smooth surface from a triangular mesh. The iterative 
algorithm performs the following two steps in every iteration. 

Fig. 8.31 Catmull-Clark subdivision of a cube 

Fig. 8.32 The √3-subdivision algorithm divides every triangle into three triangles and flips every 
old edge (legend: old edges, new edges, new vertices) 


* For each triangle, insert a new vertex at its centroid, and split the triangle into 
three triangles as in Fig. 8.32a. This operation performs the subdivision of 
the mesh, increasing the number of triangles by a factor of three in a single 
subdivision step. The operation also introduces three new edges, one from each 
vertex to the centroid, along the direction of a median. 

* Flip the old edges as shown in Fig. 8.32b. This operation contributes to the 
smoothing of the mesh. 


Applying the subdivision operator twice causes the tri-section of every original 
edge. The method is therefore referred to as the √3-subdivision scheme. An edge 
data structure which is both non-recursive and much simpler compared to the 
half-edge structure, containing only references to the incident vertices v1, v2, and 
adjacent triangles f1, f2, is particularly useful for this algorithm (Listing 8.10). The 
edge flip operation can be implemented by simply traversing the edge list, and for 
each edge (edge) creating two new faces (faceNew1, faceNew2) as in Listing 
8.11. After traversing the list of edges, the old face list is deleted and replaced with 
the list of new faces. 

Listing 8.10 Data structure for the √3-subdivision algorithm. 

struct Vertex;                  //forward declarations, so that Edge compiles 
struct Face; 

struct Edge 
{ 
    Vertex *v1, *v2;            //start and end vertices 
    Face *f1, *f2;              //left and right faces 
}; 
struct Vertex 
{ 
    float x, y, z; 
}; 
struct Face 
{ 
    Vertex *v1, *v2, *v3, *v4; 
    Vertex *mid;                //new vertex inserted at the centroid of the face 
}; 


Listing 8.11 The edge-flip operation. 

Input: An edge from the list of edges 
Output: Two new faces faceNew1, faceNew2 
faceNew1->v1 = edge->v1; 
faceNew1->v2 = edge->f2->mid; 
faceNew1->v3 = edge->f1->mid; 
faceNew2->v1 = edge->f1->mid; 
faceNew2->v2 = edge->f2->mid; 
faceNew2->v3 = edge->v2; 

Fig. 8.33 Three iterations of the application of the √3-subdivision algorithm on a cube model 


The application of the √3-subdivision method on the triangular mesh model of 
a cube is shown in Fig. 8.33. 


8.6 Mesh Parameterization 


Mesh parameterization can be broadly defined as the process of generating a 
mapping of points in a three-dimensional mesh to points belonging to a simpler 
parametric domain. A parameterization typically associates a unique two-dimensional 
point to every vertex, thus establishing a mapping from a subset of 
$\mathbb{R}^3$ to a subset of $\mathbb{R}^2$. The two-dimensional domain could simply be a region of a 
plane, or in a more general case a set of parametric coordinates defined on another 
surface such as a sphere. The mesh is then said to be parametrically embedded in 
that domain. Mesh parameterization finds several applications in computer graphics 
such as texture mapping, mesh morphing and re-meshing. 

One of the primary goals of parameterization is to achieve a one-to-one and 
invertible mapping (a bijection). Some parameterizations additionally preserve 
angles and areas. Angle preserving mappings are called conformal, while 
area preserving mappings are known as authalic. Triangular meshes that are 
topologically equivalent to a disc have a simple planar parameterization using 
piecewise linear mappings. If we can find a one-one correspondence of the vertices 
$P_i = (x_i, y_i, z_i)$, i = 1..3, of a triangle to points $S_i = (u_i, v_i)$, i = 1..3 in a plane, then 
the map f of any point (x, y, z) within the triangle is given by the linear function 


$$ f(x, y, z) = \lambda_1 S_1 + \lambda_2 S_2 + \lambda_3 S_3 \tag{8.26} $$


where $(\lambda_1, \lambda_2, \lambda_3)$ are the barycentric coordinates of the point (x, y, z) with respect to 
the triangle $P_1P_2P_3$ (Eq. 2.48). The above linear mapping is also shown in Fig. 2.12. 
The problem of planar embedding for a triangular mesh therefore reduces to the 
problem of determining the mapping for just the vertices of the triangles. We consider 
below a physics-based method for obtaining this mapping for an open mesh. 


8.6.1 Barycentric Embedding 


Imagine a three-dimensional triangular mesh fitted with springs along each edge, 
and rigid links at each vertex where the springs meet. The springs are assumed to 
have a zero rest length. If we stretch this network of springs and place it on a plane 
so that the boundary vertices of the mesh are firmly attached to points around a 
convex polygon, the interior vertices will settle in a minimum energy configuration 
(Fig. 8.34). We then have a planar embedding of the mesh without any fold-over of 
triangles. 

We denote the map of a vertex $V_i(x_i, y_i, z_i)$ on the mesh by $P_i(u_i, v_i)$, i = 1..n. 
The potential energy of the spring attached to the edge $P_iP_j$ is proportional to the 
square of the displacement: 

$$ E_{ij} = \frac{1}{2} K_{ij} |P_i - P_j|^2 \tag{8.27} $$

where $K_{ij}$ is the spring constant. For any point $P_i$, if $N_i$ denotes the set of indices of 
its one-ring neighbours, the sum of potential energies of all edges incident at $P_i$ is 
given by 

$$ E(P_i) = \frac{1}{2} \sum_{j \in N_i} K_{ij} |P_i - P_j|^2, \quad i = 1, \ldots, n. \tag{8.28} $$

Fig. 8.34 Planar embedding of a mesh 


The total potential energy of the system is obtained by adding up the above values 
for every vertex. We note that every edge is counted twice in the summation and 
therefore we further multiply the result by half. 

$$ E = \frac{1}{4} \sum_{i=1}^{n} \sum_{j \in N_i} K_{ij} |P_i - P_j|^2 \tag{8.29} $$

For the minimum energy configuration, the partial derivatives of the above 
expression with respect to the variable $P_i$ must be zero. Hence 

$$ \sum_{j \in N_i} K_{ij} (P_i - P_j) = 0 \tag{8.30} $$

From the above equation, we get 

$$ P_i = \sum_{j \in N_i} \beta_{ij} P_j \tag{8.31} $$

where 

$$ \beta_{ij} = \frac{K_{ij}}{\sum_{r \in N_i} K_{ir}} \tag{8.32} $$

Since the $K_{ij}$ are all positive and there are at least two edges incident at a vertex, we
have $0 < \beta_{ij} < 1$ for all $j \in N_i$. Thus Eq. 8.31 expresses $P_i$ as a convex combination
of its one-ring neighbours in the planar domain. Let the boundary vertices be
given by $P_{m+1}, \ldots, P_n$ for some value of $m < n$. Since these vertices are fixed at
known positions around a convex polygon, the only unknowns to be determined are
the locations of the interior vertices $P_1, \ldots, P_m$. Equation 8.31 can be re-written as
follows:

$$P_i - \sum_{j \in N_i,\; j \le m} \beta_{ij} P_j = \sum_{j \in N_i,\; j > m} \beta_{ij} P_j = Q_i, \qquad i = 1, \ldots, m. \tag{8.33}$$


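If we set $\beta_{ij} = 0$ if $j \notin N_i$, then the above set of linear equations can be written as
a single matrix equation:

$$\begin{bmatrix} 1 & -\beta_{12} & \cdots & -\beta_{1m} \\ -\beta_{21} & 1 & \cdots & -\beta_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ -\beta_{m1} & -\beta_{m2} & \cdots & 1 \end{bmatrix} \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{bmatrix} = \begin{bmatrix} Q_1 \\ Q_2 \\ \vdots \\ Q_m \end{bmatrix} \tag{8.34}$$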


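The above equation in fact represents two equations in u and v coordinates of
the interior points $P_i$. Since $0 \le \beta_{ij} < 1$, the m x m matrix in the above equation is
diagonally dominant as well as non-singular. The planar locations of the interior
points are therefore given by

$$\begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{bmatrix} = \begin{bmatrix} 1 & -\beta_{12} & \cdots & -\beta_{1m} \\ -\beta_{21} & 1 & \cdots & -\beta_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ -\beta_{m1} & -\beta_{m2} & \cdots & 1 \end{bmatrix}^{-1} \begin{bmatrix} Q_1 \\ Q_2 \\ \vdots \\ Q_m \end{bmatrix} \tag{8.35}$$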


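Box 8.4 Commonly Used Expressions for Spring Constants (Edge
Weights) $K_{ij}$


Wachspress Metric:

$$K_{ij} = \frac{\cot \gamma_{ij} + \cot \delta_{ij}}{r_{ij}^2}$$


Discrete Harmonic Metric:

$$K_{ij} = \cot \theta_{ij} + \cot \varphi_{ij}$$


Mean Value Metric:

$$K_{ij} = \frac{\tan (\alpha_{ij}/2) + \tan (\beta_{ij}/2)}{r_{ij}}$$

Here $r_{ij} = \lVert V_i - V_j \rVert$ is the length of the edge, and the angles are measured in the
two triangles that share the edge $V_iV_j$: $\gamma_{ij}$ and $\delta_{ij}$ at the vertex $V_j$, $\theta_{ij}$ and $\varphi_{ij}$ at
the two vertices opposite the edge, and $\alpha_{ij}$ and $\beta_{ij}$ at the vertex $V_i$.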


The values of $Q_1, \ldots, Q_m$ can be pre-computed using Eq. 8.33. A simple choice
for $\beta_{ij}$ is

$$\beta_{ij} = \frac{1}{|N_i|}, \qquad j \in N_i \tag{8.36}$$


The above setting is equivalent to assigning a unit value for all spring constants
($K_{ij} = 1$, for $j \in N_i$, for all i). This also implies that for a given i, the values of $\beta_{ij}$
are all equal and independent of j. The position of a vertex relative to its neighbours
is thus ignored. In fact, Eq. 8.31 places $P_i$ at the barycentric centre of the closed
polygon formed by its one-ring neighbours. Also note that the definition of $\beta_{ij}$ is
not symmetric, i.e., in general $\beta_{ij} \ne \beta_{ji}$. A few other commonly used metrics for $K_{ij}$ are listed
in Box 8.4. These metrics capture information about the geometry of the mesh
surrounding an edge using distances and angles within the triangles that border the
edge. For each metric, the values of $K_{ij}$ are further normalized using Eq. 8.32 to
obtain the corresponding values of $\beta_{ij}$. The metrics are defined using the angles
within the adjacent triangles of the edge $V_iV_j$ of the original mesh.

The inverse of the matrix in Eq. 8.35 can be computed easily for simple meshes
only, when m is small. For large values of m, we can solve the system iteratively
by either Jacobi or Gauss-Seidel methods. Rewriting Eq. 8.33 as an update equation
for $P_i$ in the (k + 1)th iteration in terms of the values of $P_j$ in the previous iteration
k, we have the following solution based on the Jacobi method:

$$P_i^{(k+1)} = Q_i + \sum_{j \in N_i,\; j \le m} \beta_{ij} P_j^{(k)}, \qquad i = 1, \ldots, m, \quad k = 0, 1, \ldots \tag{8.37}$$

The Gauss-Seidel method uses the updated values of $P_1, \ldots, P_{i-1}$ and the
previous values $P_{i+1}, \ldots, P_m$ to update $P_i$:

$$P_i^{(k+1)} = Q_i + \sum_{j \in N_i,\; j < i} \beta_{ij} P_j^{(k+1)} + \sum_{j \in N_i,\; i < j \le m} \beta_{ij} P_j^{(k)}, \qquad i = 1, \ldots, m, \quad k = 0, 1, \ldots \tag{8.38}$$

Fig. 8.35 Spherical embedding of a triangular mesh


The advantage of the Gauss-Seidel method over the Jacobi method is that the values
of $P_i$ can be sequentially updated in place within the same list without having to
maintain two separate lists for the previous and the updated values. In both the above
cases, a convergence criterion is used to determine when the iteration must stop:

$$\lVert P_i^{(k+1)} - P_i^{(k)} \rVert < \varepsilon, \qquad i = 1, \ldots, m, \quad k = 0, 1, \ldots \tag{8.39}$$

where $\varepsilon$ is a user-specified threshold that is independent of i.
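
The Gauss-Seidel iteration described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the companion implementation: the function name `planar_embedding` and the input layout (0-based indices, interior vertices listed first, one neighbour list per interior vertex) are hypothetical, and uniform weights (unit spring constants) are used for every edge.

```python
import math

def planar_embedding(neighbours, boundary, eps=1e-10, max_iter=10000):
    """Gauss-Seidel relaxation of the interior vertices with uniform weights.

    neighbours[i] lists the one-ring indices of interior vertex i (0-based).
    Interior vertices are 0..m-1; boundary vertices are m..n-1 and stay fixed
    at the (u, v) positions given in `boundary`.
    """
    m = len(neighbours)
    # Assumed initial guess: every interior vertex starts at the origin.
    P = [(0.0, 0.0)] * m + [(float(u), float(v)) for u, v in boundary]
    for _ in range(max_iter):
        largest_move = 0.0
        for i in range(m):
            beta = 1.0 / len(neighbours[i])          # uniform weight 1/|N_i|
            u = beta * sum(P[j][0] for j in neighbours[i])
            v = beta * sum(P[j][1] for j in neighbours[i])
            largest_move = max(largest_move,
                               math.hypot(u - P[i][0], v - P[i][1]))
            P[i] = (u, v)                            # updated in place
        if largest_move < eps:                       # stopping criterion
            break
    return P

# A unit square (vertices 2..5) with two interior vertices (0 and 1).
ring = [[2, 3, 1, 5], [0, 3, 4, 5]]
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
P = planar_embedding(ring, corners)
```

Solving the two linear equations of this small example by hand gives the interior positions (0.4, 0.4) and (0.6, 0.6), which the iteration approaches. Because the list `P` is overwritten as soon as a vertex is updated, no second copy of the positions is needed.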


8.6.2 Spherical Embedding 


The methods presented in the previous section are suitable for open manifold
meshes. A closed manifold mesh of genus zero, on the other hand, is topologically
equivalent to a sphere, and therefore the natural parameterization domain for such
meshes is a sphere. A spherical embedding generates a mapping of vertices of a closed
mesh to points on a sphere. As a consequence, triangles of the mesh get mapped to
spherical triangles (Fig. 8.35). For a triangular mesh, the mapped set of spherical
triangles must form a partition of the sphere. The embedding associates a pair of
spherical coordinates $(\alpha, \delta)$, $0 \le \alpha \le 2\pi$, $-\pi/2 \le \delta \le \pi/2$, with every
three-dimensional vertex of the mesh.

For geometrically simple closed meshes centred at the origin, the vertices can
be directly projected to the surface of a unit sphere using coordinate normalization.
The spherical coordinates are then extracted from the normalized coordinates $(u_i, v_i, w_i)$
using the following equations:

$$\alpha_i = \tan^{-1}\left(\frac{u_i}{w_i}\right), \qquad \delta_i = \tan^{-1}\left(\frac{v_i}{\sqrt{u_i^2 + w_i^2}}\right) \tag{8.40}$$


The above values can be further transformed into the range [0, 1] if they are to 
be used as texture coordinates. For a general triangular mesh, the iterative solution 
for the minimum energy equation in Eq. 8.31 can be extended for a mapping onto a 
unit sphere as follows: 


$$P_i^{(k+1)} = (1 - \lambda) P_i^{(k)} + \lambda \sum_{j \in N_i} \beta_{ij} P_j^{(k)}, \qquad \lVert P_i \rVert = 1. \tag{8.41}$$


where $P_i = (u_i, v_i, w_i)$, $i = 1..n$ are points on the unit sphere, and $\lambda$ is a damping
parameter. The value of $\lambda$ is usually set to 0.5. The weights $\beta_{ij}$ are computed using
Eq. 8.36. The Gauss-Seidel solver provides the following iterative solution for the
above equation:

$$S_i = (1 - \lambda) P_i^{(k)} + \lambda \sum_{j \in N_i,\; j > i} \beta_{ij} P_j^{(k)} + \lambda \sum_{j \in N_i,\; j < i} \beta_{ij} P_j^{(k+1)}, \qquad P_i^{(k+1)} = \frac{S_i}{\lVert S_i \rVert} \tag{8.42}$$
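
The damped relaxation on the unit sphere can be sketched as follows. This is an illustrative sketch only: the names `normalize` and `spherical_relaxation`, the 0-based neighbour lists and the fixed iteration count are assumptions, uniform weights are used, and the code makes no attempt to detect fold-overs or a collapse of the embedding.

```python
import math

def normalize(p):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
    return (p[0] / length, p[1] / length, p[2] / length)

def spherical_relaxation(vertices, neighbours, lam=0.5, iterations=50):
    """Gauss-Seidel style relaxation with projection back onto the unit sphere.

    vertices:   3D points of a closed mesh assumed to be centred at the origin
    neighbours: neighbours[i] lists the one-ring indices of vertex i
    lam:        damping parameter
    """
    P = [normalize(v) for v in vertices]      # initial projection to the sphere
    for _ in range(iterations):
        for i, ring in enumerate(neighbours):
            beta = 1.0 / len(ring)            # uniform weight 1/|N_i|
            s = tuple((1.0 - lam) * P[i][c]
                      + lam * beta * sum(P[j][c] for j in ring)
                      for c in range(3))
            P[i] = normalize(s)               # in-place update, unit length
    return P

# A slightly distorted octahedron: vertices 0..5 with their one-ring lists.
verts = [(1, 0.2, 0.1), (-1, 0.3, 0), (0.1, 1, 0.2),
         (0, -1, 0.3), (0.2, 0.1, 1), (0.1, 0, -1)]
rings = [[2, 3, 4, 5], [2, 3, 4, 5], [0, 1, 4, 5],
         [0, 1, 4, 5], [0, 1, 2, 3], [0, 1, 2, 3]]
Q = spherical_relaxation(verts, rings)
```

Since the list is updated in place, vertices with a lower index already hold their new positions when vertex i is processed, which is what distinguishes this scheme from a Jacobi-style update.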


In the next section, we give an outline of another important class of mesh 
processing algorithms, namely polygon triangulation. 


8.7 Polygon Triangulation 


Triangles have the property of being the simplest convex and planar polygonal
regions. For this reason, triangular meshes are generally preferred to more complex
polygonal meshes by applications involving both processing and rendering
of meshes. In this section, we consider two important classes of triangulation
algorithms:


* Polygon triangulation. 
* Point set triangulation. 


Polygon triangulation is the process of decomposing a polygon into a set of 
triangles such that the vertices of the triangles are the same as the vertices of the 
polygon, and no two triangles intersect. In a triangulation of a polygon, the union 
of all triangles is the complete polygon. An implicit assumption in all polygon
triangulation algorithms is the fact that every simple polygon (see next section) can
be triangulated. Indeed, every simple polygon with n vertices can be decomposed
into a set of n-2 triangles.

Point set triangulation is a relatively complex problem of triangulating the convex 
hull of a given set of points on a two-dimensional plane. The vertices of the convex 
hull as well as the interior points of the hull are included in the triangulation. We 
impose the planar restriction here since in a general three-dimensional space, four 
points can be connected together to form tetrahedral regions that enclose a volume. 


8.7.1 Polygon Types 


Polygons are the most fundamental blocks in the construction and processing of 
meshes. The geometric operations that can be performed on a mesh heavily depend 
on the type of polygons used. In this section, we look at some of the important 
polygon classes and their commonly used properties. 

As mentioned in the previous section, mesh algorithms often restrict polygons 
to simple polygons. A simple polygon is defined as a closed polygon without self- 
intersections, and is thus topologically equivalent to a circle. A convex polygon is a 
simple polygon that satisfies several properties. Every line segment connecting two 
points within a convex polygon lies entirely within the polygon. The interior angles 
of a convex polygon are all less than or equal to 180°. Every anticlockwise traversal 
of a convex polygon either continues straight, or turns left at every vertex. Point 
inclusion tests and convex hull algorithms use this property. Convex polygons admit 
simple and straightforward solutions to many processing algorithms. For example, a
convex polygon can be easily triangulated from any vertex (the resulting triangulation
may not be angle-optimal, though). A regular polygon is a special type of convex
polygon that is both equiangular (all angles are equal) and equilateral (all sides
are equal). A regular polygon is an approximation of a circle in the sense that as the
number of sides is increased, the shape of the polygon tends to that of a circle.

A star-shaped polygon is characterized by the property that there exists at least
one point within or on the polygon which is visible to every other point inside the
polygon. Specifically, if $\Gamma$ is a polygon and if there exists a point Q either on or
inside $\Gamma$, such that for every other point $P \in \Gamma$, the line segment PQ lies entirely
within $\Gamma$, then the polygon is star-shaped. The set of all points Q satisfying the above
condition is called the kernel of the polygon (Fig. 8.36a). A star-shaped polygon can
thus be defined as a polygon with a non-empty kernel. The kernel of a star-shaped
polygon is always a convex polygon formed by the intersection of the half-planes of
all the edges directed towards the interior of the polygon. If the kernel is the interior
of the whole polygon, then the polygon is obviously convex. Conversely, for every
convex polygon, the kernel is the interior of the polygon itself.
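
The half-plane characterization of the kernel gives a direct membership test. The sketch below assumes that the polygon is simple and that its vertices are listed in anticlockwise order, so the interior lies to the left of every directed edge; the names `cross` and `in_kernel` are illustrative only.

```python
def cross(a, b, c):
    """Twice the signed area of triangle abc; positive if a, b, c turn left."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_kernel(polygon, q):
    """True if q lies in the interior half-plane (or on the line) of every edge."""
    n = len(polygon)
    return all(cross(polygon[i], polygon[(i + 1) % n], q) >= 0
               for i in range(n))

# A square with a notch cut into its top edge (anticlockwise order).
notched = [(0, 0), (4, 0), (4, 4), (2, 3), (0, 4)]
print(in_kernel(notched, (2, 1)))      # a point that sees the whole polygon
print(in_kernel(notched, (0.2, 3.8)))  # inside the polygon, outside the kernel
```

A polygon is star-shaped exactly when at least one point passes this test; for a convex polygon every interior point passes it.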

Fig. 8.36 (a) The kernel of a star-shaped polygon and (b) its triangulation from a vertex inside the
kernel. (c) An edge visible polygon


Fig. 8.37 (a) A monotone polygonal chain with respect to the line L. (b) An x-monotone polygon


We saw earlier in Sect. 8.4.1 that the vertex decimation algorithm generates
star-shaped polygons that require triangulation after the removal of a vertex (Fig. 8.18).
If the kernel of a polygon contains a vertex, then the polygon can be triangulated
from that vertex (Fig. 8.36b). A convex polygon can therefore be triangulated from
any of its vertices. A polygon is said to be edge-visible if there exists an edge E of
the polygon such that for any point P within the polygon, there exists a point $Q \in E$
such that the line segment PQ lies entirely within the polygon (Fig. 8.36c).
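
Triangulating from a vertex amounts to building a triangle fan. The following sketch works on vertex indices only and assumes that the chosen start vertex lies in the kernel (any vertex, if the polygon is convex); the function name `fan_triangulation` is hypothetical.

```python
def fan_triangulation(n, start=0):
    """Index triples of the n-2 triangles fanning out from vertex `start`.

    Vertices are numbered 0..n-1 in boundary order. The result is a valid
    triangulation only if vertex `start` belongs to the kernel.
    """
    return [(start, (start + i) % n, (start + i + 1) % n)
            for i in range(1, n - 1)]

print(fan_triangulation(4))           # quadrilateral fanned from vertex 0
print(fan_triangulation(5, start=3))  # pentagon fanned from vertex 3
```

The fan always contains n-2 triangles, in agreement with the triangle count of any triangulation of a simple polygon with n vertices.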

A polygonal line is called a monotone polygonal chain with respect to a line L if 
every line perpendicular to L intersects the chain at most once (Fig. 8.37a). A simple 
polygon is monotone with respect to L if any line orthogonal to L intersects the 
polygon at most twice. A convex polygon is always monotone with respect to any 
line on the plane of the polygon. An x-monotone polygon can be subdivided into 
upper and lower x-monotone chains. The two chains meet at the leftmost and the 
rightmost points of the polygon (Fig. 8.37b). Starting from the leftmost point, the 
x-coordinates of the vertices monotonically increase along each chain. 
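
One way to test for x-monotonicity is to count how often the boundary reverses its direction along the x-axis: an x-monotone polygon reverses exactly twice, at its leftmost and rightmost points. The sketch below is a hypothetical helper that assumes a simple polygon given as a closed vertex list; vertical edges are simply skipped, which is a convention chosen here and not taken from the text.

```python
def is_x_monotone(polygon):
    """True if the boundary changes its x-direction exactly twice."""
    n = len(polygon)
    directions = []
    for i in range(n):
        dx = polygon[(i + 1) % n][0] - polygon[i][0]
        if dx != 0:                      # ignore vertical edges
            directions.append(dx > 0)
    # Compare each direction with its cyclic predecessor.
    changes = sum(1 for i in range(len(directions))
                  if directions[i] != directions[i - 1])
    return changes == 2

monotone = [(0, 0), (2, -2), (3, -1), (6, -2), (8, 0), (5, 3), (1, 2)]
zigzag = [(0, 0), (3, 0.5), (1, 1), (3.5, 2), (-0.5, 2.5)]
print(is_x_monotone(monotone), is_x_monotone(zigzag))
```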

A polygon is called a weakly externally visible (WEV) polygon if and only if
for every point P on the boundary of the polygon, there exists a half-line (ray)
starting at P that does not intersect the polygon anywhere else. In other words, every
point on the boundary of a WEV polygon is visible to some point at infinity (Fig. 8.38).
Star-shaped and monotone polygons are clearly externally visible.

Fig. 8.38 Every point on the boundary of a weakly externally visible polygon is visible to some
point at infinity. The polygon on the right is not a WEV polygon


Fig. 8.39 The edge flip operation can be used for producing locally angle-optimal triangulation 


8.7.2 Edge-Flip Algorithm 


A quadrilateral can be triangulated in at most two possible ways. The triangulation 
that gives the maximum value for the minimum angle among the two triangles is 
called angle-optimal (Fig. 8.39a). One triangulation can be obtained from the other
by flipping the dividing edge while making sure that the quadrilateral does not 
contain a reflex vertex. If a vertex is reflex, flipping the edge results in an invalid 
triangulation (Fig. 8.39b). We saw earlier that a convex polygon can be triangulated 
from any vertex. To obtain an angle-optimal triangulation we consider every pair of 
adjacent triangles and flip the common edge if the resulting configuration gives a 
higher value for the minimum angle (Fig. 8.39c). 

In an angle-optimal triangulation, the sum of the two angles opposite the common
edge is always less than 180° (Fig. 8.39a, b). Such a pair of triangles is said to meet the
Delaunay condition. Then, the triangles also satisfy the condition that interiors of
the circumcircles of both triangles are point-free. The edge flipping method outlined
above can be extended to get an optimal triangulation (known as the Delaunay
triangulation) of a set of points. The algorithm incrementally adds points to a set and
re-triangulates the set using the edge-flip operation. If a point is added to the interior
of an existing triangle, the triangle is split into three, and all adjacent triangles are
checked if they satisfy the Delaunay condition (Fig. 8.40a). If the new point falls
on the edge of an existing triangle, it is connected to its opposite vertices and the
Delaunay condition is checked for the four pairs of triangles surrounding the vertex
(Fig. 8.40b). If the new point is outside all existing triangles, it is joined to the visible
vertices of the convex hull of the set and the affected regions are re-triangulated
(Fig. 8.40c). At any stage, the algorithm thus produces the convex hull of the points
added so far, along with an angle-optimal triangulation of the hull.

Fig. 8.40 Three different cases of the incremental Delaunay triangulation algorithm
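
The Delaunay condition for a pair of adjacent triangles can be checked directly from the two angles opposite the shared edge. The sketch below is one possible formulation, not the implementation of the companion program; the names `angle_at` and `needs_flip` and the tolerance `tol` are assumptions. Triangles (p, q, r) and (p, q, s) share the edge pq, with r and s on opposite sides of it.

```python
import math

def angle_at(apex, a, b):
    """Interior angle (radians) at `apex` between the directions to a and b."""
    ux, uy = a[0] - apex[0], a[1] - apex[1]
    vx, vy = b[0] - apex[0], b[1] - apex[1]
    return math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy)

def needs_flip(p, q, r, s, tol=1e-12):
    """True if the shared edge pq violates the Delaunay condition."""
    return angle_at(r, p, q) + angle_at(s, p, q) > math.pi + tol

# Thin triangles on a long shared edge: the edge should be flipped.
print(needs_flip((0, 0), (4, 0), (2, 0.5), (2, -0.5)))
# Tall triangles on a short shared edge: already angle-optimal.
print(needs_flip((0, 0), (1, 0), (0.5, 2), (0.5, -2)))
```

When the test reports a violation, the two opposite angles add up to more than 180°, so the remaining two angles of the quadrilateral add up to less than 180°. The quadrilateral is then convex and the flip is always valid.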


8.7.3 Three Coins Algorithm 


The Three Coins algorithm is a versatile and easy to implement method that can be 
used for finding both the convex hull and the triangulation of a star-shaped polygon. 
The iterative backtracking algorithm is based on the orientation of three vertices 
(hence the name "three coins") of a polygon. In a general three-dimensional space,
the orientation of three points A, B, C is defined according to Eq. 2.10. On the other 
hand, if the points are two-dimensional, we use Eq. 2.11 to determine if three points 
make a left turn. 

Let us first consider the algorithm for obtaining the convex hull of a star-shaped
polygon with n vertices $v_0, v_1, \ldots, v_{n-1}$ (Fig. 8.41). Assume that the vertices
are given in the counter-clockwise order, and that $v_0$ is a convex vertex (whose
interior angle is less than 180°). We can always choose the vertex with minimum
y-coordinate as $v_0$, since vertices with minimum or maximum coordinate values are
guaranteed to be convex. The Three Coins algorithm uses the fact that on the convex
hull, any three consecutive vertices must make a left turn when the hull is traversed
in the anticlockwise direction. Starting with the first three points A = $v_0$, B = $v_1$,
C = $v_2$, the algorithm checks if A, B, C form a left turn. If so, we move forward in
the list of vertices by one step by replacing A with B, B with C, and C with the next
vertex in the list. If the three vertices do not make a left turn, the middle point B
is deleted from the list, and we move one step backward. This is done by replacing
A with its predecessor, and B with A. For this particular case, before updating the
values of A, B, C, we also add a new edge AC. The algorithm can be implemented
using a stack S as given in the pseudo code below (Listing 8.12).

Fig. 8.41 (a) A star-shaped polygon with anticlockwise ordering of vertices. (b) The output of the
Three Coins algorithm

The edges added by the Three Coins algorithm are shown as dotted lines in 
Fig. 8.41b. As seen in the figure, the algorithm finds the convex hull of the star- 
shaped polygon, and also triangulates each of its pockets. A pocket is an exterior 
portion of the polygon that lies within its convex hull. Each pocket is a star-shaped 
polygon bounded by an edge of the convex hull. This bounding edge is called a lid. 
A pocket is always edge-visible with respect to its lid. It may also be noted that 
the Three Coins algorithm traverses each pocket in the clockwise direction while 
triangulating it. We can therefore apply the algorithm for triangulating a polygon 
that is edge-visible with respect to an edge E, by ordering its vertices in clockwise 
direction, and initiating the traversal from the second end point of E in the clockwise 
ordering of vertices (Fig. 8.42a). We denote this vertex as vo. Initiating the algorithm 
from a different vertex can lead to an invalid triangulation (Fig. 8.42b). 
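
The stack-based procedure of Listing 8.12 can be written as a short Python sketch. It assumes that the vertices are (x, y) tuples in anticlockwise order and that the first vertex is convex; the names `cross` and `three_coins` are illustrative and not taken from the companion code. Like the listing, the sketch stops after the last vertex and does not re-examine the closing turn back at the first vertex.

```python
def cross(a, b, c):
    """Positive if a, b, c make a left turn, zero if they are collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def three_coins(pts):
    """Return (stack, edges): indices left on the stack and the edges added."""
    n = len(pts)
    stack = [0, 1]                # v1 on top of the stack
    edges = []
    k = 2
    while k < n:
        c = k
        b = stack.pop()
        a = stack[-1]
        if cross(pts[a], pts[b], pts[c]) >= 0:    # collinear or left turn
            stack.append(b)
            stack.append(c)
            k += 1
        else:                                     # right turn: drop B, add AC
            edges.append((a, c))
            if a == 0:                            # cannot back up past v0
                stack.append(c)
                k += 1
    return stack, edges

notched = [(0, 0), (4, 0), (4, 4), (2, 3), (0, 4)]
hull, added = three_coins(notched)
print(hull, added)
```

For the notched square used here, the reflex vertex 3 is removed from the stack and the edge joining vertices 2 and 4 closes the pocket above it.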

Listing 8.12 Procedure for the Three Coins algorithm

    Stack S;
    S.push(v0); S.push(v1);        // v1 on top of stack
    k = 2;                         // index of the next vertex to examine
    while (k < n) do
    {
        C = vk;
        B = S.pop();
        A = S.peek();
        if (points A, B, C are collinear or make a left turn)
        {
            S.push(B); S.push(C);  // move forward by one step
            k = k + 1;
        }
        else {
            Create the edge AC;    // B is deleted; backtrack by one step
            if (A == v0)
            {
                S.push(C);         // cannot backtrack beyond v0
                k = k + 1;
            }
        }
    }


Fig. 8.42 (a) Triangulation of an edge-visible polygon using Three Coins algorithm. (b) An
invalid triangulation resulting from an improper choice of the starting vertex.


We saw earlier in Sect. 8.7.1 that a star-shaped polygon can be triangulated from
a vertex that lies within the kernel of the polygon. In the most general case, however,
the kernel may not contain any of the vertices. Also, it may not be possible to directly
apply the Three Coins algorithm, since a star-shaped polygon may not be
edge-visible. Before applying the Three Coins algorithm, a star-shaped polygon is split
into two by adding an edge from one of the vertices and passing through some point
in the kernel. In Fig. 8.43a, this edge is shown as the line VP, where V is a vertex
and P a point in the kernel. Extended beyond P, this edge intersects the boundary of
the polygon on the other side at a point where a temporary extra vertex Q is added.
For the vertex decimation algorithm, the vertex selected for removal is connected to
every vertex in its one-ring neighbourhood, and therefore belongs to its kernel
(Fig. 8.18). Thus the point P is readily obtained. The extra point Q and the edge VQ
are removed once both sides of the edge are triangulated.

Since the point P belongs to the kernel, the sub-polygons $\Gamma_1$, $\Gamma_2$ on either side
of the edge VQ are edge-visible polygons (Fig. 8.43a). By ordering the vertices
in the clockwise sense, the polygon $\Gamma_1$ can be triangulated using the Three Coins
algorithm, starting from vertex V. Similarly $\Gamma_2$ can be triangulated starting from
vertex Q (Fig. 8.43b). The splitting edge VQ, the temporary vertex Q, and all edges
incident at Q are now removed. The hole formed by this operation is actually an
edge-visible polygon with respect to the edge to which Q belonged. This is because
Q was previously connected to all vertices of the hole. We can therefore invoke
the Three Coins algorithm again to triangulate the hole, and this process completes
the triangulation of the whole star-shaped polygon (Fig. 8.43c).

Fig. 8.43 (a) A star-shaped polygon is split into two by adding an edge through a vertex V and a
point P in the kernel. (b) Each sub-polygon bounded by the new edge is triangulated. (c) The new
edge is removed and the resulting hole is triangulated


8.7.4 Triangulation of Monotone Polygons 


In this section, we consider the triangulation of x-monotone polygons. Any monotone
polygon can be converted to an x-monotone polygon by a single rotational
transformation of all its vertices. The vertices of the polygon are sorted in the
ascending order of x coordinates. Let the sorted set be $V = \{v_0, v_1, \ldots, v_{n-1}\}$.
The left-most vertex $v_0$ is a convex vertex where the upper and lower monotone
chains meet (Fig. 8.44a). Similar to the Three Coins algorithm, the algorithm
for triangulating a monotone polygon P also uses a stack S of vertices which is
initialized with $v_0$ and $v_1$. Vertices are processed in the increasing order of x, and
triangulation is done by adding edges from these vertices and splitting off triangles
from the polygon wherever possible. The un-triangulated part of the polygon is
labelled P'. The vertices stored in the stack at any stage of the algorithm are denoted
by $s_0, s_1, \ldots, s_t$, where $s_t$ is at the top of the stack. These are vertices that have been
examined, but could not be fully processed, i.e., edges could not be generated from
these vertices yet.
The vertices stored on the stack satisfy the following properties:


* $s_0$ is the left-most vertex of the polygon P' (Fig. 8.44b).

* $s_0, s_1, \ldots, s_t$ are consecutive vertices on either the lower chain (Fig. 8.44c) or
the upper chain (Fig. 8.44d) of the polygon P'.

* $s_1, \ldots, s_{t-1}$ are reflex vertices in P'.


Fig. 8.44 (a) Ordering of vertices on an x-monotone polygon. (b) Stack vertices form a
boundary of the untriangulated polygon P'. (c) A sequence of stack vertices on the lower chain.
(d) A sequence of stack vertices on the upper chain


Fig. 8.45 (a) None of the stack vertices is removed and v is pushed onto the stack. (b) The updated
stack contains elements $s_0, \ldots, s_k$, v. (c) The updated stack contains only elements $s_t$ and v


Since $s_t$ is the last examined vertex, the next vertex $v \in V$ in the lexical ordering
will always be on the right of $s_t$. Depending on the relative position of v with respect
to the stack vertices, it will be either stored in the stack (becoming the next top of
stack vertex), or used to create edges thereby removing some vertices from the stack.
Three possible cases are shown in Fig. 8.45.


* Case 1: v is adjacent to $s_t$, and $s_t$ is a reflex vertex in P' (Fig. 8.45a). In this
case, edges cannot be created from v, and therefore v is pushed onto the stack.

* Case 2: v is adjacent to $s_t$, and $s_t$ is a convex vertex in P' (Fig. 8.45b). At least
one stack vertex can be connected to v by an edge. If the angle $s_{j-1} s_j v$ is less than
180° for every j from t down to k+1, for some k such that $0 \le k < t$, then the
vertices $s_k, \ldots, s_{t-1}$ can all be connected to v. The vertices $s_{k+1}, \ldots, s_t$ are
removed from the stack, and v is pushed onto the stack.

* Case 3: v is adjacent to $s_0$ in P' (Fig. 8.45c). In this case, v lies on the opposite
chain as the stack vertices and is therefore visible to every stack vertex. v is
connected to vertices $s_1, \ldots, s_t$, and all stack vertices are removed from the stack.
$s_t$ and v are then pushed onto the stack, with v now on the top of the stack.


Fig. 8.46 Some of the intermediate stages in the triangulation of an x-monotone polygon. The dot
indicates the current vertex. S represents the stack after update, with the rightmost element on top
of stack


In all the three cases above, the current vertex v becomes the next top of stack
element. The algorithm stops when the last vertex $v_{n-1}$ is pushed onto the stack.
The process of triangulation of an x-monotone polygon using the above algorithm
is shown in Fig. 8.46.

The sorting of the vertices in the pre-processing stage of the algorithm requires
O(n log n) time. After that, each vertex is examined only once and the algorithm
performs n iterations. The total number of edges added is n-3. Thus the triangulation
algorithm alone (without considering the pre-processing stage) runs in O(n) time.
A pseudo-code of the algorithm is given in Listing 8.13.
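
The three cases can also be expressed as a runnable Python sketch. It follows the commonly used stack formulation of monotone triangulation rather than reproducing Listing 8.13 line by line: chain membership replaces the adjacency test, a cross product replaces the angle test, and the last vertex is handled separately. The sketch assumes a strictly x-monotone polygon in anticlockwise order with distinct x-coordinates; all names (`cross`, `triangulate_monotone`) are hypothetical.

```python
def cross(a, b, c):
    """Positive if a, b, c make a left turn."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def triangulate_monotone(poly):
    """Return (diagonals, triangles) as lists of vertex index tuples."""
    n = len(poly)
    left = min(range(n), key=lambda i: poly[i][0])
    right = max(range(n), key=lambda i: poly[i][0])
    lower = set()                      # leftmost vertex plus the lower chain
    i = left
    while i != right:
        lower.add(i)
        i = (i + 1) % n
    order = sorted(range(n), key=lambda i: poly[i][0])
    stack = [order[0], order[1]]
    diagonals, triangles = [], []
    for v in order[2:-1]:
        top = stack[-1]
        if (v in lower) != (top in lower):
            # Case 3: v is on the opposite chain and sees every stack vertex.
            triangles += [(a, b, v) for a, b in zip(stack, stack[1:])]
            diagonals += [(u, v) for u in stack[1:]]
            stack = [top, v]
        else:
            # Cases 1 and 2: pop while the popped vertex is convex in P'.
            sign = 1 if v in lower else -1
            last = stack.pop()
            while stack and sign * cross(poly[stack[-1]], poly[last], poly[v]) > 0:
                triangles.append((stack[-1], last, v))
                diagonals.append((stack[-1], v))
                last = stack.pop()
            stack += [last, v]
    v = order[-1]                      # rightmost vertex closes the polygon
    triangles += [(a, b, v) for a, b in zip(stack, stack[1:])]
    diagonals += [(u, v) for u in stack[1:-1]]
    return diagonals, triangles

poly = [(0, 0), (2, -2), (3, -1), (6, -2), (8, 0), (5, 3), (1, 2)]
diagonals, triangles = triangulate_monotone(poly)
print(len(diagonals), len(triangles))
```

For the seven-vertex example the sketch adds n-3 = 4 diagonals and produces n-2 = 5 triangles whose areas add up to the area of the polygon.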


Listing 8.13 Pseudo-code for the triangulation algorithm

    Sort vertices of the polygon in ascending order of x
    Sorted set = {v0, ..., vn-1}
    Stack S;
    S.push(v0); S.push(v1);            // v1 on top of stack
    k = 2;
    while (k <= n-1) do
    {
        v = vk;                        // Current vertex
        st = S.pop();                  // Top of stack
        if (adj(v, st))                // v is adjacent to st
        {
            w = S.peek();              // Predecessor of st
            if (angle(w, st, v) >= 180)
            {                          // Case 1
                S.push(st);
            }
            else
            {                          // Case 2
                do {
                    addEdge(w, v);
                    u = S.pop();
                    if (S.isEmpty()) break;
                    w = S.peek();
                } while (angle(w, u, v) < 180);
                S.push(u);
            }
        }
        else
        {                              // Case 3
            u = st;
            while (not S.isEmpty()) {
                addEdge(u, v);
                u = S.pop();
            }
            S.push(st);
        }
        S.push(v);
        k = k + 1;
    }


8.8 Summary


This chapter outlined the fundamental properties of polygonal manifold meshes,
important mesh operations and related algorithms. The chapter began with an
outline of ASCII mesh file formats that are easy to create, read and modify. Such
file formats are useful for developing and testing mesh processing algorithms with
the help of simple polygonal models. Commonly used geometrical and topological
properties of meshes were then introduced. The winged-edge and half-edge data
structures provide convenient representations of mesh neighbourhood information
required for processing adjacency queries and localized geometry operations. The
usefulness of both face based and edge based data structures has been demonstrated
using examples.

Mesh simplification algorithms are heavily used in applications requiring multiple
levels of detail for rendering objects. The vertex decimation and the edge
collapse algorithms are iterative methods that progressively simplify a mesh based
on a user-specified cost function. These algorithms could be used to reduce the
number of vertices, edges and polygons in a mesh while preserving essential
topological and shape characteristics. The quadric error metric is commonly used
as the cost function for edge collapse operations. Mesh subdivision algorithms
iteratively subdivide each triangle or quadrilateral of a mesh and re-adjust the
positions of vertices using blending functions. They are used for modelling objects
from a base mesh by applying a series of smoothing and approximation
operations. The Loop subdivision algorithm is designed for triangular meshes,
while the Catmull-Clark algorithm is particularly suitable for quadrilateral meshes.
The √3 algorithm relies on an edge-flip operation for generating a smooth
surface.

This chapter also gave an overview of the process of mesh parameterization using 
both planar and spherical embedding. Both methods use physics-based models and 
require iterative solutions of linear systems in the mesh vertex coordinates to obtain 
the embedding of a mesh in a different domain. The chapter concluded with an 
outline of polygon triangulation algorithms that could be applied to star-shaped and 
monotone polygons. 


8.9 Supplementary Material for Chap. 8 


The section Chapter8/Code on the companion website contains the following 
programs implementing and demonstrating the working of key algorithms discussed 
in this chapter. 


1. Mesh.cpp


Additional files:
Mesh.h


Here you will find the header and implementation files for a mesh class. 
Mesh data can be read from files stored in OFF and OBJ formats and internally 
represented using a vertex list and a polygon list. The class supports triangular 
and quadrilateral polygonal manifold meshes, and a half-edge data structure for 
storing the connectivity information. 


2. EdgeCollapse.cpp 


Additional files:
Mesh.h
Mesh.cpp
mesh.off


The program shows the working of the edge collapse algorithm described in
Sect. 8.4.2. Mesh data is read from the file "mesh.off". The mesh is assumed to be
a closed triangular mesh. The edge with the minimum error metric is highlighted
in each step. Press space bar to advance to the next iteration.


3. LoopSubdivision.cpp 


Additional files: 
Mesh.h 
Mesh.cpp 
mesh.off 


The program demonstrates the working of the Loop subdivision algorithm
given in Sect. 8.5.2. Mesh data is read from the file "mesh.off". The mesh is
assumed to be a closed triangular mesh. Press space bar to advance to the next
iteration. The maximum number of iterations is set at 4. Pressing 'w' displays
the wireframe model (default), and 's' displays the solid model.


4. CatmullClark.cpp 


Additional files: 
Mesh.h 
Mesh.cpp 
mesh.off 


The program performs Catmull-Clark subdivision on an input quadrilateral
mesh, as described in Sect. 8.5.3. Mesh data is read from the file "mesh.off". The
mesh is assumed to be a closed quadrilateral mesh. Press space bar to advance
to the next iteration. The maximum number of iterations is set at 4. Pressing 'w'
displays the wireframe model (default), and 's' displays the solid model.


5. Delaunay.cpp 


Additional files:
none


The program generates an incremental Delaunay triangulation of a set of
user specified points using the edge flip operation (Sect. 8.7.2). The points are
specified interactively using mouse input (left button). The maximum number of
points is set at 20.


6. ThreeCoins.cpp 


Additional files: 
polygon.dat 


The program demonstrates the Three Coins algorithm by triangulating an edge 
visible polygon. The polygon definition is read into the program from the file 
"polygon.dat". The algorithm always starts from the first point in the input vertex 
list. The user should therefore ensure that the polygon is edge visible, and the 
first vertex is the correct initial vertex on the visible edge. The vertices are also 
assumed to be ordered in the clockwise sense. The algorithm may not generate a 
valid triangulation if any of these conditions is not satisfied. 


8.10 Bibliographical Notes 


Several books on computer graphics such as Foley (1996), Nielsen (2005) and
Shirley and Ashikhmin (2007) give a good coverage of fundamental mesh processing
algorithms. There have been a few recent publications (e.g., Botsch 2010;
Edelsbrunner 2001; De Loera et al. 2010) that primarily deal with mesh algorithms
and therefore serve as excellent references for development and research in this area.

The winged edge data structure was introduced by Baumgart (1972), and
several enhancements have since been proposed by researchers for various types
of mesh operations. Kettner (1998) gives a detailed description and comparison of
edge-based representations of polyhedral meshes. Schroeder et al. (1992) presents
the vertex decimation algorithm and its implementation aspects. A simple
implementation of the edge collapse algorithm is given in Melax (1998). The quadric
error metric (QEM) for the edge collapse operation was introduced by Michael
Garland in his Ph.D thesis (Garland 1999). The Loop algorithm for the subdivision
of triangular meshes was proposed by Charles Loop (1987). Details of the
Catmull-Clark algorithm can be found in Catmull and Clark (1978). The √3 subdivision
algorithm is discussed at length in Kobbelt (2000). A comprehensive analysis of
subdivision algorithms can be found in Zorin (2006).

An introductory paper on mesh parameterization methods can be found in Bennis
et al. (1991). Recent papers by Floater and Hormann (2005) and Saba et al. (2005)
give an in-depth analysis of mesh parameterization algorithms.


Chapter 9
Collision Detection


Overview 


Collision detection is an integral component of game engines that are designed to 
provide realistic animations of object interactions with the player and the game 
environment. Physically realistic dynamic simulations such as flight simulators 
and mobile robot simulators also require efficient collision detection algorithms. 
Intersection tests form the backbone of collision detection algorithms. They are also 
used in ray tracing algorithms, acceleration algorithms such as view frustum culling 
and portal culling, and in real-time animations. This chapter gives an extensive 
coverage of methods used for testing if primitives and bounding volumes overlap. 

Collision detection in a large scene consisting of several objects requires efficient 
methods to minimize the number of intersection tests. This chapter discusses the 
usefulness of bounding volume hierarchies and spatial partitioning trees such as 
octrees, k-d trees and bounding interval hierarchies, and covers the
important algorithms in each category.


9.1 Bounding Volumes 


Bounding volumes provide a convenient approximation of the space occupied by 
a mesh object or a collection of objects for the purpose of intersection testing and 
collision detection. A set of objects represented by a bounding volume must be 
contained entirely within the volume, so that if another object does not intersect 
this volume, it can be readily concluded that the object does not intersect anything 
within it. Commonly used bounding volumes are axis aligned bounding boxes 
(AABB), oriented bounding boxes (OBB), spheres, convex hulls and discrete 
oriented polytopes (k-DOP). All of these are closed volumes and have a convex 
shape. AABBs and spheres were introduced in Sect. 3.4 and their computation 
was given in Box 3.1 (Sect. 3.4). In this section, we will explore additional 
properties of these bounding volumes and also consider other relatively more
complex geometries. It should be noted that mesh models of bounding volumes
are not needed for collision detection algorithms, and therefore only mathematical
representations of the regions they enclose are generally used. A mesh model is
sometimes created only for the purpose of visualizing an algorithm.

Fig. 9.1 Two different representations of an axis-aligned bounding box


9.1.1 Axis Aligned Bounding Box (AABB) 


An axis aligned bounding box is given by six parameters $(x_{min}, y_{min}, z_{min})$,
$(x_{max}, y_{max}, z_{max})$ representing the coordinates of two diagonally opposite vertices of the
box. The bounding volume can also be defined by its midpoint $(x_{mid}, y_{mid}, z_{mid})$ and
the three half-width extents $x_r$, $y_r$, $z_r$ along the principal axes directions (Fig. 9.1).
The advantage of this representation over the former is that any translation of the box 
can be modelled by updating only the three coordinates of the midpoint, whereas in 
the former representation all six coordinates would need to be updated. 

Given a set of mesh vertices with coordinates $\{x_i, y_i, z_i\}$, $i = 0 \ldots n-1$, we can
compute the AABB parameters for the mesh object as follows:

$$x_{mid} = \tfrac{1}{2}(x_{max} + x_{min}), \quad y_{mid} = \tfrac{1}{2}(y_{max} + y_{min}), \quad z_{mid} = \tfrac{1}{2}(z_{max} + z_{min})$$

$$x_r = \tfrac{1}{2}(x_{max} - x_{min}), \quad y_r = \tfrac{1}{2}(y_{max} - y_{min}), \quad z_r = \tfrac{1}{2}(z_{max} - z_{min}) \tag{9.1}$$

where $x_{min}$, $x_{max}$, etc., are computed as given in Box 3.1 (Sect. 3.4).
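
As an illustration of Eq. 9.1, the following Python sketch computes the midpoint and half-width representation directly from a vertex list. The function names `aabb_from_vertices` and `translate_aabb` are hypothetical and do not come from the text; the minimum and maximum values are found with a plain scan, which is assumed to be equivalent to the procedure of Box 3.1.

```python
def aabb_from_vertices(vertices):
    """Return (mid, half) for a list of (x, y, z) tuples, following Eq. 9.1."""
    if not vertices:
        raise ValueError("at least one vertex is required")
    xs, ys, zs = zip(*vertices)
    lo = (min(xs), min(ys), min(zs))          # (x_min, y_min, z_min)
    hi = (max(xs), max(ys), max(zs))          # (x_max, y_max, z_max)
    mid = tuple((h + l) / 2.0 for l, h in zip(lo, hi))
    half = tuple((h - l) / 2.0 for l, h in zip(lo, hi))
    return mid, half


def translate_aabb(mid, half, offset):
    """Translate the box: only the three midpoint coordinates change."""
    return tuple(m + o for m, o in zip(mid, offset)), half
```

The second function reflects the advantage noted above: in the midpoint form a translation leaves the half-width extents untouched.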


9.1.2 Minimal Bounding Sphere 


A sphere enclosing a set of vertices can be readily obtained by computing the
centroid of the points and finding the maximum distance of the points from the
centre. Such a computation can lead to a larger than required bounding volume
if the points are distributed unevenly or concentrated at one end of a mesh.
Computation of the sphere from the AABB (see Box 3.1, Sect. 3.4) often gives a
better approximation of the volume occupied by the mesh vertices. However, the
method also does not give the optimal sphere that has the minimum volume. A two-
dimensional version of this situation is illustrated in Fig. 9.2.

Fig. 9.2 A configuration of vertices for which neither the centroid of the points nor the centre of
the AABB gives the minimal bounding circle. (A) Bounding circle with centre at the centroid of
the vertices. (B) Bounding circle with centre at the midpoint of the AABB

Fig. 9.3 The minimal bounding circle is updated to contain the new point on the boundary

Welzl's algorithm is an incremental method for the computation of the
optimal bounding sphere of a given set of $n$ points $S_n = \{P_0, P_1, \ldots, P_{n-1}\}$. The
algorithm is based on the fact that if the smallest bounding sphere $D_i$ for the points
$S_i = \{P_0, P_1, \ldots, P_{i-1}\}$ is updated to include another point $P_i$ that lies outside $D_i$, the
new minimal bounding sphere $D_{i+1}$ must have $P_i$ on its surface. A two-dimensional
example is given in Fig. 9.3.

The implementation of Welzl's algorithm as a recursive function is given
as pseudo-code in Listing 9.1. The function is invoked as minSphere(S, n,
B, 0); where the variable S represents the set $S_n$, and B is a set of boundary
points which is initially empty. As shown below, the minimum sphere $D_k$ is defined
in terms of the minimum sphere $D_{k-1}$ for $k-1$ points, with the point $P_{k-1}$ removed.
The initial value of $k$ is $n$.
Listing 9.1 Pseudo code for 2D Welzl's algorithm

    Sphere minSphere(Point pt[], int np, Point bnd[], int nb)
    {
        if (np == 1)
        {
            if (nb == 0) return Sphere1pt(pt[0]);
            if (nb == 1) return Sphere2pts(pt[0], bnd[0]);
        }
        else if (np == 0)
        {
            if (nb == 1) return Sphere1pt(bnd[0]);
            if (nb == 2) return Sphere2pts(bnd[0], bnd[1]);
        }
        if (nb == 3) return Sphere3pts(bnd[0], bnd[1], bnd[2]);
        Sphere D = minSphere(pt, np-1, bnd, nb);
        if (D.isInside(pt[np-1])) return D;
        bnd[nb] = pt[np-1];
        nb++;
        D = minSphere(pt, np-1, bnd, nb);
        return D;
    }


$$D_k = \begin{cases} D_{k-1}, & \text{if } P_{k-1} \in D_{k-1} \\ D_{min}(S_{k-1}, P_{k-1}), & \text{otherwise} \end{cases} \tag{9.2}$$


where $D_{min}(S_{k-1}, P_{k-1})$ is the smallest disc enclosing the set of points $S_{k-1}$ with $P_{k-1}$
on the boundary. The minimal sphere at any stage of the algorithm is represented by
the pair {c, r} where c is the centre and r the radius. The spheres for the base cases
are defined by the functions Sphere1pt(), Sphere2pts() etc., as follows.


Sphere1pt(P): centre c = P, radius r = 0

Sphere2pts(P, Q): centre c = (P + Q)/2, radius r = |c − P|
Sphere3pts(P, Q, R): centre c, radius r as given in Eqs. 9.4 and 9.6.
Sphere4pts(P, Q, R, T): centre c, radius r as given in Eq. 9.16.
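
The recursion can be tried out with a small two-dimensional Python sketch. It follows the idea of Listing 9.1 but is not a line-by-line transcription: the base cases are reached only when no points remain, the boundary set is passed as a tuple, and the circle through three points is found by solving the perpendicular-bisector equations directly. All function names and the tolerance `eps` are assumptions made for this illustration.

```python
import math


def circle_1pt(p):
    return (p, 0.0)


def circle_2pts(p, q):
    c = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    return (c, math.hypot(c[0] - p[0], c[1] - p[1]))


def circle_3pts(p, q, r):
    # Solve 2 a.u = |a|^2 and 2 b.u = |b|^2 for the centre offset u from r.
    ax, ay = p[0] - r[0], p[1] - r[1]
    bx, by = q[0] - r[0], q[1] - r[1]
    d = 2.0 * (ax * by - ay * bx)
    if d == 0.0:
        raise ValueError("collinear boundary points")
    a2, b2 = ax * ax + ay * ay, bx * bx + by * by
    ux = (by * a2 - ay * b2) / d
    uy = (ax * b2 - bx * a2) / d
    return ((r[0] + ux, r[1] + uy), math.hypot(ux, uy))


def is_inside(circle, p, eps=1e-9):
    (cx, cy), radius = circle
    return math.hypot(p[0] - cx, p[1] - cy) <= radius + eps


def min_circle(pts, n, bnd=()):
    """Smallest circle enclosing pts[0..n-1] with the points in bnd on its boundary."""
    if len(bnd) == 3:
        return circle_3pts(*bnd)
    if n == 0:
        if len(bnd) == 2:
            return circle_2pts(*bnd)
        if len(bnd) == 1:
            return circle_1pt(bnd[0])
        return None                      # no points at all
    d = min_circle(pts, n - 1, bnd)      # D_{k-1}
    p = pts[n - 1]                       # P_{k-1}
    if d is not None and is_inside(d, p):
        return d
    return min_circle(pts, n - 1, bnd + (p,))   # P_{k-1} must lie on the boundary
```

A call such as `min_circle(points, len(points))` mirrors the invocation `minSphere(S, n, B, 0)`. The recursion depth grows with the number of points, so the sketch is only suitable for small point sets.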


As noted above, the minimal bounding sphere passing through two points is 
simply a sphere that has the two points at opposite ends of a diameter. The pair 
of points that are located diametrically opposite on a sphere are called antipodes. 
For three points P, Q, R, the sphere has the circle passing through the points as one 
of its great circles. In this case, the sphere's centre and radius are the same as that 
of the circle passing through the points. The interior angle at the vertex R of the 
triangle PQR is given by 


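$$\sin\theta = \frac{|\mathbf{a} \times \mathbf{b}|}{|\mathbf{a}|\,|\mathbf{b}|} \tag{9.3}$$


where a = P − R, and b = Q − R.
The radius of the circle (and therefore the sphere) through the three points can
now be obtained as


$$r = \frac{|\mathbf{b} - \mathbf{a}|}{2\sin\theta} = \frac{|\mathbf{b} - \mathbf{a}|\,|\mathbf{a}|\,|\mathbf{b}|}{2\,|\mathbf{a} \times \mathbf{b}|} \tag{9.4}$$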
Fig. 9.4 Minimum sphere passing through three points


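The directions along the perpendicular bisectors of the sides a and b towards the
centre of the circle are given by (see Fig. 9.4)

$$\mathbf{n}_a = (\mathbf{a} \times \mathbf{b}) \times \mathbf{a}, \qquad \mathbf{n}_b = \mathbf{b} \times (\mathbf{a} \times \mathbf{b}) \tag{9.5}$$


The position of the centre can be concisely expressed in terms of the above
vectors as

$$\mathbf{c} = \mathbf{s} + \frac{|\mathbf{b}|^2\,\mathbf{n}_a + |\mathbf{a}|^2\,\mathbf{n}_b}{2\,|\mathbf{a} \times \mathbf{b}|^2} \tag{9.6}$$


where $\mathbf{s} = (x_s, y_s, z_s)$ denotes the position of the point R.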
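
Equations 9.4 to 9.6 translate almost directly into code. The sketch below is a minimal Python version of the three-point base case; the helper names `sub`, `dot`, `cross` and the function name `sphere_3pts` are chosen for this illustration, and the points are assumed to be non-collinear.

```python
import math


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def sphere_3pts(P, Q, R):
    """Centre and radius of the sphere whose great circle passes through P, Q, R."""
    a, b = sub(P, R), sub(Q, R)
    axb = cross(a, b)
    axb2 = dot(axb, axb)
    if axb2 == 0:
        raise ValueError("the three points are collinear")
    n_a = cross(axb, a)                      # (a x b) x a, Eq. 9.5
    n_b = cross(b, axb)                      # b x (a x b), Eq. 9.5
    a2, b2 = dot(a, a), dot(b, b)
    centre = tuple(R[i] + (b2 * n_a[i] + a2 * n_b[i]) / (2.0 * axb2)
                   for i in range(3))        # Eq. 9.6 with s = R
    ba = sub(b, a)
    radius = math.sqrt(dot(ba, ba) * a2 * b2) / (2.0 * math.sqrt(axb2))  # Eq. 9.4
    return centre, radius
```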

Four non-coplanar points P, Q, R, T uniquely determine a sphere in three
dimensions. The parameters of the sphere are obtained from the most general form
of a sphere given in terms of its centre $\mathbf{c} = (x_c, y_c, z_c)$ and radius $r$:

$$(x - x_c)^2 + (y - y_c)^2 + (z - z_c)^2 = r^2, \text{ or equivalently,}$$
$$(x^2 + y^2 + z^2) + ux + vy + wz + k = 0 \tag{9.7}$$


where $u = -2x_c$, $v = -2y_c$, $w = -2z_c$, and $k = x_c^2 + y_c^2 + z_c^2 - r^2$. Since the four
points P, Q, R, T are required to lie on the above sphere, we can substitute the
coordinates of each point and obtain the following simultaneous equations:


$$\begin{aligned}
(x_p^2 + y_p^2 + z_p^2) + u x_p + v y_p + w z_p + k &= 0 \\
(x_q^2 + y_q^2 + z_q^2) + u x_q + v y_q + w z_q + k &= 0 \\
(x_s^2 + y_s^2 + z_s^2) + u x_s + v y_s + w z_s + k &= 0 \\
(x_t^2 + y_t^2 + z_t^2) + u x_t + v y_t + w z_t + k &= 0
\end{aligned} \tag{9.8}$$
Treating Eq. 9.7 and the above set as a system of linear equations in the unknowns u, v, w,
and k, we get the following condition on the coefficients:


$$\begin{vmatrix}
x^2 + y^2 + z^2 & x & y & z & 1 \\
x_p^2 + y_p^2 + z_p^2 & x_p & y_p & z_p & 1 \\
x_q^2 + y_q^2 + z_q^2 & x_q & y_q & z_q & 1 \\
x_s^2 + y_s^2 + z_s^2 & x_s & y_s & z_s & 1 \\
x_t^2 + y_t^2 + z_t^2 & x_t & y_t & z_t & 1
\end{vmatrix} = 0. \tag{9.9}$$


The equation of the sphere can be directly obtained by expanding the above
determinant along its first row as follows:


$$M_{11}(x^2 + y^2 + z^2) - M_{12}\,x + M_{13}\,y - M_{14}\,z + M_{15} = 0 \tag{9.10}$$


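where $M_{ij}$ are the minors given by


$$M_{11} = \begin{vmatrix} x_p & y_p & z_p & 1 \\ x_q & y_q & z_q & 1 \\ x_s & y_s & z_s & 1 \\ x_t & y_t & z_t & 1 \end{vmatrix} \tag{9.11}$$

$$M_{12} = \begin{vmatrix} x_p^2 + y_p^2 + z_p^2 & y_p & z_p & 1 \\ x_q^2 + y_q^2 + z_q^2 & y_q & z_q & 1 \\ x_s^2 + y_s^2 + z_s^2 & y_s & z_s & 1 \\ x_t^2 + y_t^2 + z_t^2 & y_t & z_t & 1 \end{vmatrix} \tag{9.12}$$

$$M_{13} = \begin{vmatrix} x_p^2 + y_p^2 + z_p^2 & x_p & z_p & 1 \\ x_q^2 + y_q^2 + z_q^2 & x_q & z_q & 1 \\ x_s^2 + y_s^2 + z_s^2 & x_s & z_s & 1 \\ x_t^2 + y_t^2 + z_t^2 & x_t & z_t & 1 \end{vmatrix} \tag{9.13}$$

$$M_{14} = \begin{vmatrix} x_p^2 + y_p^2 + z_p^2 & x_p & y_p & 1 \\ x_q^2 + y_q^2 + z_q^2 & x_q & y_q & 1 \\ x_s^2 + y_s^2 + z_s^2 & x_s & y_s & 1 \\ x_t^2 + y_t^2 + z_t^2 & x_t & y_t & 1 \end{vmatrix} \tag{9.14}$$

$$M_{15} = \begin{vmatrix} x_p^2 + y_p^2 + z_p^2 & x_p & y_p & z_p \\ x_q^2 + y_q^2 + z_q^2 & x_q & y_q & z_q \\ x_s^2 + y_s^2 + z_s^2 & x_s & y_s & z_s \\ x_t^2 + y_t^2 + z_t^2 & x_t & y_t & z_t \end{vmatrix} \tag{9.15}$$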

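Since the four points are non-coplanar, from Eq. 9.11 we find that $M_{11} \neq 0$. The
centre and the radius of the minimal sphere are now readily obtained as follows:


$$x_c = \frac{M_{12}}{2M_{11}}, \qquad y_c = -\frac{M_{13}}{2M_{11}}, \qquad z_c = \frac{M_{14}}{2M_{11}}$$

$$r = \sqrt{x_c^2 + y_c^2 + z_c^2 - \frac{M_{15}}{M_{11}}} \tag{9.16}$$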
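
The determinant form lends itself to a compact implementation. The following Python sketch evaluates the five minors of Eqs. 9.11 to 9.15 by deleting one column at a time from a 4 × 5 array and then applies Eq. 9.16. The names `det` and `sphere_4pts`, the cofactor-expansion determinant and the coplanarity tolerance are assumptions of this illustration; a production implementation would more likely use a linear algebra library.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def sphere_4pts(P, Q, R, T):
    """Centre and radius of the sphere through four non-coplanar points (Eq. 9.16)."""
    # One row per point: [x^2 + y^2 + z^2, x, y, z, 1]
    rows = [[x * x + y * y + z * z, x, y, z, 1.0] for (x, y, z) in (P, Q, R, T)]

    def minor(col):
        # M_{1,col+1}: delete column `col` from the 4x5 array
        return det([row[:col] + row[col + 1:] for row in rows])

    m11, m12, m13, m14, m15 = (minor(j) for j in range(5))
    if abs(m11) < 1e-12:
        raise ValueError("the four points are coplanar")
    xc = m12 / (2.0 * m11)
    yc = -m13 / (2.0 * m11)
    zc = m14 / (2.0 * m11)
    r = math.sqrt(xc * xc + yc * yc + zc * zc - m15 / m11)
    return (xc, yc, zc), r
```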


9.1.3 Oriented Bounding Box (OBB) 


The oriented bounding box (OBB) gives a closer approximation of the underlying 
mesh geometry compared to the AABB and the sphere. An OBB can be thought 
of as a rotated AABB, whose axes are aligned along mutually orthogonal principal 
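directions of variance of the points with respect to the centroid. If the vertices of a
mesh object are given by $\{x_i, y_i, z_i\}$, $i = 0 \ldots n-1$, we can compute their centroid
$(\bar{x}, \bar{y}, \bar{z})$, and form the following matrix:


$$V = \begin{bmatrix}
x_0 - \bar{x} & x_1 - \bar{x} & \cdots & x_{n-1} - \bar{x} \\
y_0 - \bar{y} & y_1 - \bar{y} & \cdots & y_{n-1} - \bar{y} \\
z_0 - \bar{z} & z_1 - \bar{z} & \cdots & z_{n-1} - \bar{z}
\end{bmatrix} \tag{9.17}$$


The scatter (or covariance) matrix C is a 3 × 3 symmetric matrix given by


$$C = \frac{1}{n} V V^T = \frac{1}{n} \begin{bmatrix}
\sum_{k=0}^{n-1} (x_k - \bar{x})^2 & \sum_{k=0}^{n-1} (x_k - \bar{x})(y_k - \bar{y}) & \sum_{k=0}^{n-1} (x_k - \bar{x})(z_k - \bar{z}) \\
\sum_{k=0}^{n-1} (x_k - \bar{x})(y_k - \bar{y}) & \sum_{k=0}^{n-1} (y_k - \bar{y})^2 & \sum_{k=0}^{n-1} (y_k - \bar{y})(z_k - \bar{z}) \\
\sum_{k=0}^{n-1} (x_k - \bar{x})(z_k - \bar{z}) & \sum_{k=0}^{n-1} (y_k - \bar{y})(z_k - \bar{z}) & \sum_{k=0}^{n-1} (z_k - \bar{z})^2
\end{bmatrix}
= \begin{bmatrix}
\sigma_x^2 & \sigma_{xy} & \sigma_{xz} \\
\sigma_{xy} & \sigma_y^2 & \sigma_{yz} \\
\sigma_{xz} & \sigma_{yz} & \sigma_z^2
\end{bmatrix} \tag{9.18}$$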
Fig. 9.5 (a) The eigenvalues and the eigenvectors of the covariance matrix can be used to compute
an ellipse with axes along directions of maximum and minimum variance. (b) The oriented
bounding box uses the ellipse's parameters


where $\sigma_x^2$ denotes the variance of the vector $\{x_i\}$, $\sigma_{xy}$ the covariance between the
vectors $\{x_i\}$, $\{y_i\}$, and so on. Being real and symmetric, the above matrix has real eigenvalues $\lambda_1$,
$\lambda_2$, $\lambda_3$ and a mutually orthogonal set of eigenvectors $v_1$, $v_2$, $v_3$. The normalization
of each of these vectors yields an orthonormal basis $e_1$, $e_2$, $e_3$. Treating the set of
mesh vertices $\{x_i, y_i, z_i\}$, $i = 0 \ldots n-1$, as a point cloud, the unit vectors $e_1$, $e_2$, $e_3$
define the principal axes directions of an ellipsoid with corresponding semi-axis
lengths $\sqrt{\lambda_1}$, $\sqrt{\lambda_2}$ and $\sqrt{\lambda_3}$ (Fig. 9.5a). The OBB also has the same axis directions
and half-width extents (Fig. 9.5b). The OBB can thus be completely specified by its
centre $(\bar{x}, \bar{y}, \bar{z})$, its half-width extents $w_1$ $(= \sqrt{\lambda_1})$, $w_2$ $(= \sqrt{\lambda_2})$, $w_3$ $(= \sqrt{\lambda_3})$, and
unit vectors $e_1$, $e_2$, $e_3$ along its axes.
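
The first step of the OBB construction, forming the centroid and the covariance matrix, can be sketched in a few lines of Python. The function name `covariance_matrix` is hypothetical, and the normalisation by n is an assumption of this sketch (some texts divide by n − 1 instead). The eigen-decomposition itself is left to a numerical library and is not shown.

```python
def covariance_matrix(vertices):
    """Return (centroid, C) for a list of (x, y, z) tuples.

    C is the 3x3 covariance matrix; entry C[i][j] averages the product of the
    i-th and j-th coordinate deviations from the centroid.
    """
    n = len(vertices)
    if n == 0:
        raise ValueError("at least one vertex is required")
    centroid = tuple(sum(v[i] for v in vertices) / n for i in range(3))
    C = [[sum((v[i] - centroid[i]) * (v[j] - centroid[j]) for v in vertices) / n
          for j in range(3)]
         for i in range(3)]
    return centroid, C
```

For the four points (±1, 0, 0) and (0, ±2, 0) the matrix is already diagonal, with entries 0.5, 2 and 0, so the principal axes coincide with the coordinate axes and the largest variance is along y.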

The matrix M with $e_1$, $e_2$, $e_3$ as the column vectors gives the rotational
transformation of the OBB with respect to the coordinate reference frame. The 
OBB does not always provide a tight fitting bounding box for the point cloud. This 
is because the covariance matrix depends on the distribution of the whole set of 
points, not just the points on the boundary that define the shape. Even changes in 
the locations of vertices that are inside the point cloud can affect the orientation of 
the OBB. One possible solution to this problem is to consider only the vertices on 
the convex hull of the mesh for the computation of the OBB. Another method that 
is used for the construction of the optimal OBB is to select the axis of the smallest 
eigenvalue, project all points on a plane perpendicular to this axis, and to compute 
the minimum area bounding rectangle of the projection. The chosen axis and the 
axes of the rectangle together define the orientation of the OBB. If we assume that 
$\lambda_1 \geq \lambda_2 \geq \lambda_3$, then $e_3$ is the axis of projection. The points are then projected onto
a plane orthogonal to $e_3$, and the minimal rectangle of this set gives the other two
axes $e_1'$ and $e_2'$ (Fig. 9.6a).

The minimal rectangle of a set of points on a plane can be obtained using the
rotating calipers method. The method uses the convex hull of the points and two
orthogonal pairs of support lines (Fig. 9.6b) such that one of the lines is always
aligned with an edge of the convex hull. The remaining three lines pass through
the vertices of the convex hull. The angles $\theta_i$ (i = 1...4) made by each support
line with the edge of the convex hull in anticlockwise order are computed and the
minimum angle is found. All four support lines are rotated about the corresponding
vertices of the hull by the minimum angle, and this step aligns one of the lines with
another edge of the convex hull. The area of the newly formed rectangle is computed
and the minimum area updated. The process is repeated until all edges of the convex
hull have been processed. The computation of the convex hull using algorithms such
as the Graham scan takes O(n log n) time. The rotating calipers method
visits all edges of the hull in O(n) time. The overall complexity of the optimal OBB
computation algorithm is therefore O(n log n).

Fig. 9.6 (a) Computation of the optimal OBB using a projection of the vertices orthogonal to the
axis of minimum eigenvalue. (b) The rotating calipers method


9.1.4 Discrete Oriented Polytope (k-DOP) 


A polytope is a general term for a polyhedron in any arbitrary dimension. It is 
defined as a geometrical object with flat surfaces. Points, line segments, polygons 
and polyhedrons are respectively zero-, one-, two- and three-dimensional polytopes. 
In a three-dimensional space, a discrete oriented polytope is a closed convex
polyhedron bounded by k/2 pairs of parallel planes, where k is an even integer and
k ≥ 6. Each pair of planes has a fixed orientation. Such a polyhedron is often referred
to as a k-DOP. An AABB and an OBB are both bounded by three pairs of parallel
planes, and are therefore 6-DOPs. A 10-DOP and the associated normal directions
are shown in Fig. 9.7a. A k-DOP need not always have k sides. The example in
Fig. 9.7b is an 8-DOP with only six sides, with two sides degenerating into points.
The components of the surface normal vectors of a k-DOP are usually chosen
from the set {−1, 0, +1}. Each pair of parallel sides of a k-DOP has a fixed normal
direction $n_j$, j = 0, ..., (k/2)−1. Their positions are determined such that the region
between them tightly encloses the mesh vertices (Fig. 9.8a). Now consider two
parallel planes $\Gamma_1$, $\Gamma_2$ having a normal direction $n_j$ and passing through points $p_1$,
$p_2$ respectively. The region between the parallel planes is called a slab. If a vertex $v_i$
belongs to the slab formed by $\Gamma_1$, $\Gamma_2$, then $v_i$ satisfies the equations

$$(\mathbf{v}_i - \mathbf{p}_1) \cdot \mathbf{n}_j \geq 0, \quad \text{and} \quad (\mathbf{v}_i - \mathbf{p}_2) \cdot \mathbf{n}_j \leq 0. \tag{9.19}$$

Fig. 9.7 (a) A 10-DOP and the normal directions of five of its sides, labelled (0, 1, 0), (1, 0, −1),
(0, 0, 1), (1, 0, 0) and (1, 0, 1). The remaining five sides are
parallel to these and have opposite normal directions. (b) An 8-DOP with degenerate edges

Fig. 9.8 (a) The positions of the planes are determined such that the vertices are tightly packed
within each slab. (b) A slab formed by two parallel planes


In a k-DOP, the values of $p_1$, $p_2$ are determined by the minimum and the
maximum positions of the projections of mesh vertices $v_i$, i = 0...n−1, on the
vector $n_j$ (Fig. 9.8b):

$$\begin{aligned}
d_{ij} &= \mathbf{v}_i \cdot \mathbf{n}_j \\
dmin_j &= \min_i (d_{ij}) \\
dmax_j &= \max_i (d_{ij}) \\
\mathbf{p}_1 &= (dmin_j)\,\mathbf{n}_j, \quad \mathbf{p}_2 = (dmax_j)\,\mathbf{n}_j
\end{aligned} \tag{9.20}$$
The k-DOP can be represented by the set $\{dmin_j, dmax_j, n_j\}$, j = 0...(k/2)−1,
that gives the k/2 intervals and direction vectors. The points $p_1$, $p_2$ were used only
for the purpose of explaining the construction of the intervals, and are not stored. For
fast overlap tests, it is desirable to have all k-DOPs in an application share surface
normal directions from the same set. If the normal directions are pre-defined, then
only the minimum and maximum values $\{dmin_j, dmax_j\}$, j = 0...(k/2)−1, need be
stored. A point p belongs to a k-DOP if and only if the following conditions are
simultaneously satisfied:


$$dmin_j \leq (\mathbf{p} \cdot \mathbf{n}_j) \leq dmax_j, \quad \text{for all } j. \tag{9.21}$$
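
The construction of Eq. 9.20 and the point test of Eq. 9.21 can be sketched as follows. The function names `build_kdop` and `contains`, and the particular set of four normal directions (an 8-DOP) used in the usage note, are assumptions made for this example only.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def build_kdop(vertices, normals):
    """Return the list of (dmin_j, dmax_j) intervals, one per normal direction."""
    intervals = []
    for n in normals:
        d = [dot(v, n) for v in vertices]     # projections d_ij on n_j
        intervals.append((min(d), max(d)))
    return intervals


def contains(kdop, normals, p):
    """Eq. 9.21: p is inside only if it lies within every slab."""
    return all(lo <= dot(p, n) <= hi for (lo, hi), n in zip(kdop, normals))
```

With the normals (1, 0, 0), (0, 1, 0), (0, 0, 1) and (1, 1, 0) and the triangle (0, 0, 0), (1, 0, 0), (0, 1, 0), the point (0.9, 0.9, 0) lies inside the AABB of the triangle but is rejected by the fourth slab, which shows how the extra direction tightens the volume.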


9.1.5 Convex Hulls 


The key properties of a convex polygon were outlined in Sect. 8.7.1. A convex 
hull of a set of points in two dimensions is defined as the unique convex polygon 
containing all the points and whose vertices are only points in the set. It is 
the minimal convex polygon formed by the intersection of all convex polygons 
containing the points. The convex hull therefore is the tightest fitting bounding 
volume. The convex hull can also be defined as the union of all triangles that 
can be formed using only points of the set. In this section, we will first outline 
the construction of two-dimensional convex hulls using an incremental hull update 
algorithm, and then extend the method to three-dimensional hulls. 

The 2D incremental hull algorithm builds the hull starting with a triangle formed 
by joining the first three points in the set (provided they are non-collinear), and then 
iteratively adds one point at a time to the existing hull and updates it if necessary. 
If the vertices of the initial triangle are oriented in the anticlockwise sense, then the 
vertices of the convex hulls constructed in subsequent steps will also be oriented 
in the anticlockwise sense. Assume that the given set of points is $S_n = \{P_0, P_1, \ldots, P_{n-1}\}$,
and the algorithm has constructed the convex hull of the first $i$ points
$S_i = \{P_0, P_1, \ldots, P_{i-1}\}$ $(3 \leq i < n)$. We denote this convex hull by $C_i = \{Q_0, \ldots, Q_{k-1}\} \subseteq S_i$.
When the next point $P_i$ is added, the convex hull $C_i$ is traversed to check if the point
is within the hull. This can be done by checking whether the signed area of the triangle
$Q_j Q_{j+1} P_i$, $j = 0 \ldots k-1$ $(Q_k = Q_0)$, is positive (Eq. 2.9) for all $j$. If the point $P_i$ is
inside or on the hull, the hull is not updated, i.e., $C_{i+1} = C_i$. If the point is outside the hull,
the signed area changes sign from positive to negative at some vertex on the hull,
and from negative back to positive at some other vertex. These vertices are called
the split vertex and the merge vertex respectively (Fig. 9.9). Existing edges between
these vertices are removed, and the point $P_i$ is connected to these vertices to form
the new convex hull $C_{i+1}$. The vertex set $C_i$ is updated to $C_{i+1}$ by removing the
vertices between the split and the merge vertices and adding $P_i$.
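
A naive version of the incremental update described above can be sketched in Python. The signed area is computed with the usual cross-product formula, which is assumed to agree with Eq. 2.9; the function names are hypothetical, the first three points are assumed to be non-collinear, and a new point that is exactly collinear with a hull edge may leave a redundant vertex on the hull.

```python
def signed_area(a, b, c):
    """Positive when a, b, c are in anticlockwise order."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def add_point(hull, p):
    """Update an anticlockwise convex hull (list of vertices) with point p."""
    k = len(hull)
    # Edge j joins hull[j] and hull[j+1]; it is visible from p if the area is negative.
    visible = [signed_area(hull[j], hull[(j + 1) % k], p) < 0 for j in range(k)]
    if not any(visible):
        return hull                                   # p is inside or on the hull
    # Split vertex: start of the first visible edge in anticlockwise order.
    split = next(j for j in range(k) if visible[j] and not visible[j - 1])
    merge = split
    while visible[merge % k]:
        merge += 1                                    # merge vertex: end of the last visible edge
    new_hull = [hull[split], p]
    j = merge % k
    while j != split:                                 # keep the remaining vertices in order
        new_hull.append(hull[j])
        j = (j + 1) % k
    return new_hull


def incremental_hull(points):
    a, b, c = points[:3]
    if signed_area(a, b, c) < 0:                      # make the initial triangle anticlockwise
        b, c = c, b
    hull = [a, b, c]
    for p in points[3:]:
        hull = add_point(hull, p)
    return hull
```

Each call to `add_point` traverses the whole hull, which corresponds to the naive quadratic behaviour of the algorithm.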

At each step, the algorithm requires the traversal of the hull vertices to determine
the locations of the split and merge vertices. The overall time complexity of the
algorithm is therefore O(n²) for a naive implementation. O(n log n) implementations
exist, but they cannot be directly extended to the three-dimensional convex hull
algorithm. One of the popular algorithms in this class is the Graham Scan, which
is the same as the Three-Coins algorithm (Sect. 8.7.3) with a pre-processing phase
where the points $P_i$ are initially sorted in the ascending order of the angles between
the vectors $\mathbf{p}_i = P_i - P_0$ and the x-axis.

Fig. 9.9 Incremental construction of a two-dimensional convex hull (labels: split vertex, merge vertex)

Fig. 9.10 Incremental construction of a three-dimensional convex hull (label: silhouette edge)

A natural extension of the incremental algorithm described above to a three-
dimensional data set $S_n$ consisting of n mesh vertices can be easily formulated.
The algorithm begins with the construction of a tetrahedron from four non-coplanar
points of the set $S_n$. The point inclusion test in the three-dimensional case uses
signed distances (Eq. 2.24) instead of signed areas to determine if a new point $P_i$
is within the existing convex hull. If the surface normal vectors of the triangles of
the convex hull are all specified in the outward direction, then for any point inside
the hull, the signed distance with respect to every triangle of the hull is negative.
Otherwise, the point is outside the hull, and we can determine the edges between
triangles where the transition from negative to positive takes place. These edges are
called silhouette edges. Every triangle for which the signed distance is positive is
visible with respect to the point $P_i$. These triangles are removed, and the point $P_i$
is connected to the end points of every silhouette edge to form new triangles of the
updated convex hull (Fig. 9.10).

At each step, the point inclusion step and the hull update step take O(n) time. The
total time complexity of the 3D incremental hull discussed above is thus O(n²).
9.2 Intersection Testing


The bounding volumes introduced in the previous section are often associated with 
methods that determine if two bounding volumes intersect. In this section, we will 
consider three different types of intersection tests using bounding volumes: 


* Intersection between a bounding volume and a ray. Such intersection tests are 
used in advanced ray tracing algorithms where bounding volumes are employed 
to minimize the computation of ray intersections. 

* Intersection between a bounding volume and a plane. Intersection testing of
objects with planes is used in acceleration algorithms such as view frustum
culling.

* Intersection between two bounding volumes of the same type. Collision detection
algorithms often require the testing of intersections between bounding volumes.


A ray can be represented by the pair {p, m}, where $\mathbf{p} = (x_p, y_p, z_p)$ denotes the
origin of the ray, and $\mathbf{m} = (x_m, y_m, z_m)$ a unit vector along the ray's direction. In
parametric form (see also Eq. 2.13), the ray is given by the following equations:


$$\begin{aligned}
x &= x_p + t\,x_m \\
y &= y_p + t\,y_m \\
z &= z_p + t\,z_m, \quad t \geq 0.
\end{aligned} \tag{9.22}$$


A plane always has a linear representation ax + by + cz + d = 0, where the
vector n = (a, b, c) is along the direction of its normal vector. As shown in Eq.
2.22, the equivalent vector representation of a plane is r·n = −d. If we assume that
n is a unit normal vector, then the signed distance D of a point v to this plane is
simply given by the expression v·n + d (see Eq. 2.24).

The following sections discuss methods of testing whether a bounding volume 
with a given representation intersects these primitive geometrical elements. 


9.2.1 AABB Intersection 


We first consider the intersection of an AABB given by the parameter set $\{x_{min},
x_{max}, y_{min}, y_{max}, z_{min}, z_{max}\}$ with a ray {p, m} as in Eq. 9.22. A naive approach is to
check if the ray intersects any of the six sides of the AABB. For example, one of
the sides parallel to the xy-plane is given by the equation $z = z_{min}$. We first compute
the value of t from Eq. 9.22 by substituting $z = z_{min}$, and then use this value to find
x and y. The value of t is denoted as $t_{zmin}$. The ray intersects this side if and only if
the following three conditions are simultaneously satisfied:

$$\begin{aligned}
& t_{zmin} = \frac{z_{min} - z_p}{z_m} \geq 0, \\
& x_{min} \leq (x_p + t_{zmin}\,x_m) \leq x_{max}, \\
& y_{min} \leq (y_p + t_{zmin}\,y_m) \leq y_{max}
\end{aligned} \tag{9.23}$$

Fig. 9.11 AABB intervals along the principal axes, and a non-intersecting ray

Similarly, the ray can be tested against the other planes of the AABB. A
moment's thought will reveal that it is not always necessary to test all six planes
for intersection with the ray. At most three sides of the AABB will be visible to the
point p if the point is outside the box (which is usually the case). We can also make
use of the axis-aligned nature of the sides to determine if the ray is directed away
from the AABB and therefore would not intersect the volume (Fig. 9.11). If a ray
satisfies any of the following six conditions, it will not intersect the AABB.


$$\begin{aligned}
& (x_p < x_{min}) \text{ and } (x_m < 0) \\
& (y_p < y_{min}) \text{ and } (y_m < 0) \\
& (z_p < z_{min}) \text{ and } (z_m < 0) \\
& (x_p > x_{max}) \text{ and } (x_m > 0) \\
& (y_p > y_{max}) \text{ and } (y_m > 0) \\
& (z_p > z_{max}) \text{ and } (z_m > 0)
\end{aligned} \tag{9.24}$$

If none of the above conditions is true, then we can identify the three faces visible
to p by comparing its coordinates against the corresponding intervals of the AABB.
As an example, if $(x_p > x_{max})$ and $(y_p < y_{min})$ and $(z_p > z_{max})$, then the ray need be
compared only with the planes $x = x_{max}$, $y = y_{min}$, and $z = z_{max}$.
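
The rejection conditions of Eq. 9.24 and the face test of Eq. 9.23 can be combined in a short routine. The following Python sketch is one possible arrangement: the function name, the handling of a ray origin inside the box (reported as a hit) and the skipping of faces parallel to the ray are assumptions of this example and are not prescribed by the text.

```python
def ray_hits_aabb(p, m, lo, hi):
    """p: ray origin, m: ray direction, lo/hi: (x_min, y_min, z_min)/(x_max, y_max, z_max)."""
    # Quick rejection, Eq. 9.24: origin beyond a face and heading further away.
    for k in range(3):
        if p[k] < lo[k] and m[k] < 0:
            return False
        if p[k] > hi[k] and m[k] > 0:
            return False
    if all(lo[k] <= p[k] <= hi[k] for k in range(3)):
        return True                      # assumption: an origin inside the box counts as a hit
    # Test only the (at most three) faces visible from p, Eq. 9.23.
    for k in range(3):
        if p[k] < lo[k]:
            plane = lo[k]
        elif p[k] > hi[k]:
            plane = hi[k]
        else:
            continue                     # neither face on this axis is visible
        if m[k] == 0:
            continue                     # ray is parallel to this face
        t = (plane - p[k]) / m[k]
        if t < 0:
            continue
        i, j = (k + 1) % 3, (k + 2) % 3  # the two remaining axes
        if (lo[i] <= p[i] + t * m[i] <= hi[i] and
                lo[j] <= p[j] + t * m[j] <= hi[j]):
            return True
    return False
```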
Fig. 9.12 AABB-Plane intersection test using the projected distances along the normal vector


The testing of intersection of an AABB with a plane r·n = −d, |n| = 1, can be
done by computing the signed distances of the vertices $v_i$, i = 1..8, of the AABB with
respect to the plane. If some vertices are on or above the plane, and some below the
plane, then the plane intersects the AABB. In other words, if $D_i = \mathbf{v}_i \cdot \mathbf{n} + d$ is non-
zero and has a positive value for all i, then the AABB is entirely above the plane;
if $D_i$ is non-zero and has a negative value for all i, then the AABB is below the
plane; otherwise the plane intersects the AABB. The amount of computation can be
reduced by first determining the diagonal of the AABB that is most closely aligned with
the normal vector n, and then using only the two opposite endpoints of this diagonal
to check if they are on opposite sides of the plane. Note that an AABB has only four
principal diagonals, and the selection of the diagonal closest to n is done by using
the dot-product between n and the unit vectors along the four diagonals.

Another AABB-plane intersection test (which will be extended to OBBs in the
next section) uses the projection of diagonal vectors (vectors from the centre of the
AABB to the vertices) on the normal vector n of the plane (Fig. 9.12). For this
method, we use the representation of the AABB given by the centre $\mathbf{c} = (x_{mid}, y_{mid},
z_{mid})$ and the three half-width extents $x_r$, $y_r$, $z_r$. The largest projected distance by any
vertex of the AABB on the unit normal vector $\mathbf{n} = (x_n, y_n, z_n)$ is given by


$$p = |x_r x_n| + |y_r y_n| + |z_r z_n| \tag{9.25}$$

The signed distance of the centre from the plane is $D_c = \mathbf{c} \cdot \mathbf{n} + d$. Since $D_c$ can be
negative, its magnitude is used in the comparison: the plane intersects the AABB if and only if

$$|D_c| \leq p \tag{9.26}$$

The overlap test using two AABBs can be easily performed taking advantage
of their axis-aligned property. Since the respective axes are always parallel, two
AABBs overlap only if their projected intervals (see Fig. 9.11) along each axis
overlap. Thus, two AABBs represented as $\{x^{(1)}_{min}, x^{(1)}_{max}, y^{(1)}_{min}, y^{(1)}_{max}, z^{(1)}_{min}, z^{(1)}_{max}\}$ and
$\{x^{(2)}_{min}, x^{(2)}_{max}, y^{(2)}_{min}, y^{(2)}_{max}, z^{(2)}_{min}, z^{(2)}_{max}\}$ do not overlap if any of the following conditions
is satisfied:

$$\begin{aligned}
x^{(1)}_{min} &> x^{(2)}_{max}, & x^{(1)}_{max} &< x^{(2)}_{min} \\
y^{(1)}_{min} &> y^{(2)}_{max}, & y^{(1)}_{max} &< y^{(2)}_{min} \\
z^{(1)}_{min} &> z^{(2)}_{max}, & z^{(1)}_{max} &< z^{(2)}_{min}
\end{aligned} \tag{9.27}$$

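If the AABBs are represented using their midpoint and half-width extents
as $\{x^{(1)}_{mid}, y^{(1)}_{mid}, z^{(1)}_{mid}, x^{(1)}_r, y^{(1)}_r, z^{(1)}_r\}$ and $\{x^{(2)}_{mid}, y^{(2)}_{mid}, z^{(2)}_{mid}, x^{(2)}_r, y^{(2)}_r, z^{(2)}_r\}$, then the
overlap test can be suitably modified as follows. In this case, the requirement for
bounding volume overlap is that the projected distance between the centres must
be less than or equal to the sum of the corresponding half-width extents along each
axis. Conversely, if any of the following conditions is satisfied, the two AABBs do
not overlap.

$$\begin{aligned}
\left| x^{(1)}_{mid} - x^{(2)}_{mid} \right| &> \left( x^{(1)}_r + x^{(2)}_r \right) \\
\left| y^{(1)}_{mid} - y^{(2)}_{mid} \right| &> \left( y^{(1)}_r + y^{(2)}_r \right) \\
\left| z^{(1)}_{mid} - z^{(2)}_{mid} \right| &> \left( z^{(1)}_r + z^{(2)}_r \right)
\end{aligned} \tag{9.28}$$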


Since an AABB is also an OBB, the intersection tests given in the next section 
can also be applied to an AABB. 
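
Both forms of the AABB overlap test (Eqs. 9.27 and 9.28) are shown in the following Python sketch. The function names are hypothetical; each function returns False as soon as a separating condition is found along one axis.

```python
def aabbs_overlap_minmax(lo1, hi1, lo2, hi2):
    """Eq. 9.27: lo/hi are (x_min, y_min, z_min) and (x_max, y_max, z_max)."""
    for k in range(3):
        if lo1[k] > hi2[k] or hi1[k] < lo2[k]:
            return False
    return True


def aabbs_overlap_midhalf(mid1, half1, mid2, half2):
    """Eq. 9.28: mid is the box centre, half the half-width extents."""
    for k in range(3):
        if abs(mid1[k] - mid2[k]) > half1[k] + half2[k]:
            return False
    return True
```

The two functions give the same answer for the same pair of boxes; the second form avoids recomputing the minimum and maximum values when boxes are stored by centre and extent.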


9.2.2 OBB Intersection 


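If an OBB is given by the parameters $\{\bar{x}, \bar{y}, \bar{z}, w_1, w_2, w_3, e_1, e_2, e_3\}$, the eight
vertices of the bounding volume are given by


$$\mathbf{p} = \mathbf{c} \pm w_1\mathbf{e}_1 \pm w_2\mathbf{e}_2 \pm w_3\mathbf{e}_3 \tag{9.29}$$

where $\mathbf{c} = (\bar{x}, \bar{y}, \bar{z})$. The six faces of the OBB have the following equations:


$$\begin{aligned}
(\mathbf{r} - \mathbf{c}) \cdot \mathbf{e}_1 &= \pm w_1 \\
(\mathbf{r} - \mathbf{c}) \cdot \mathbf{e}_2 &= \pm w_2 \\
(\mathbf{r} - \mathbf{c}) \cdot \mathbf{e}_3 &= \pm w_3
\end{aligned} \tag{9.30}$$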


To test the intersection of a ray {p, m} (see Eq. 9.22) with the above OBB, a
brute force algorithm would first identify the faces visible to the point p by taking
the dot-product of m with the unit vectors $e_1$, $e_2$, $e_3$. For example, the face on the positive side
of $e_1$ is not in the direction of (and therefore will not intersect) the ray if any of the
following two conditions is satisfied.


$$\begin{aligned}
(\mathbf{p} - \mathbf{c}) \cdot \mathbf{e}_1 &< w_1 \\
\mathbf{m} \cdot \mathbf{e}_1 &> 0
\end{aligned} \tag{9.31}$$


The point of intersection is obtained from the value of the ray parameter t found by
substituting the ray equation r = p + tm in the equations of the planes that are
visible to the ray. The point of intersection is further checked if it is within the
corresponding faces, with the help of their vertex coordinates.

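A faster ray intersection test was introduced by Kay and Kajiya (1986) based
on the representation of the OBB using three slabs, where each slab is bounded
by a pair of planes given in Eq. 9.30. The parameter values $t^{(i)}_{min}$, $t^{(i)}_{max}$ for each slab
(i = 1..3) are obtained by substituting r = p + tm in Eq. 9.30, as shown in the example for i = 1 below:


$$\text{If } \mathbf{m} \cdot \mathbf{e}_1 > 0: \quad t^{(1)}_{min} = \frac{-w_1 - (\mathbf{p} - \mathbf{c}) \cdot \mathbf{e}_1}{\mathbf{m} \cdot \mathbf{e}_1}, \quad t^{(1)}_{max} = \frac{w_1 - (\mathbf{p} - \mathbf{c}) \cdot \mathbf{e}_1}{\mathbf{m} \cdot \mathbf{e}_1}$$

$$\text{If } \mathbf{m} \cdot \mathbf{e}_1 < 0: \quad t^{(1)}_{min} = \frac{w_1 - (\mathbf{p} - \mathbf{c}) \cdot \mathbf{e}_1}{\mathbf{m} \cdot \mathbf{e}_1}, \quad t^{(1)}_{max} = \frac{-w_1 - (\mathbf{p} - \mathbf{c}) \cdot \mathbf{e}_1}{\mathbf{m} \cdot \mathbf{e}_1} \tag{9.32}$$


After computing the minimum and maximum values for all three slabs, the
following values are obtained:


$$\begin{aligned}
u_{min} &= \max \left\{ t^{(1)}_{min},\; t^{(2)}_{min},\; t^{(3)}_{min} \right\} \\
u_{max} &= \min \left\{ t^{(1)}_{max},\; t^{(2)}_{max},\; t^{(3)}_{max} \right\}
\end{aligned} \tag{9.33}$$


The ray intersects the OBB if $u_{min} < u_{max}$. This condition means that the
intersection of the three $[t_{min}, t_{max}]$ intervals on the ray is non-empty. The two-
dimensional version of the above method is depicted using an intersecting and a
non-intersecting ray in Fig. 9.13.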

Fig. 9.13 The intersection of rays with the slabs of an OBB

The testing of intersection of an OBB with a plane r·n = −d, |n| = 1, can be
done exactly like the methods outlined for an AABB in the previous section. The
first method computes the signed distances of the vertices $v_i$, i = 1..8, of the OBB
with respect to the plane, and if some vertices are found to be on or above the
plane, while others below the plane, then it is concluded that the plane intersects the
OBB. This method could be further simplified using only the two end-vertices of
the principal diagonal that is most closely aligned to n. We can also extend the method
based on projected distances (Fig. 9.12) to OBBs. The longest projected distance on
n generated by vectors from the centre of the OBB to its vertices is given by


$$p = w_1 |\mathbf{e}_1 \cdot \mathbf{n}| + w_2 |\mathbf{e}_2 \cdot \mathbf{n}| + w_3 |\mathbf{e}_3 \cdot \mathbf{n}| \tag{9.34}$$


The above quantity is then compared with the signed distance of the centre c of
the OBB from the plane given by $D_c = \mathbf{c} \cdot \mathbf{n} + d$. As previously shown in the case of
an AABB, the plane intersects the OBB if $|D_c| \leq p$.

A naive algorithm for overlap test between two OBBs would compare every edge
of one OBB with every face of the other OBB. In total, such a method would require
12 × 6 × 2 = 144 edge-face intersection tests. The number of intersection tests
can be considerably reduced if we use the separating axis theorem. The theorem
states that if two OBBs do not overlap, then there exists a separating plane between
them, and equivalently, the vertices of the OBBs project into disjoint intervals on
any axis perpendicular to the separating plane. The direction perpendicular to the
separating plane is called the separating axis direction. The theorem further states
that the separating plane, if it exists, is parallel to either a face of one of the OBBs,
or a plane formed by two edges, one from each OBB. The 15 possible directions for
the separating axis are: $e_1^{(1)}$, $e_2^{(1)}$, $e_3^{(1)}$, $e_1^{(2)}$, $e_2^{(2)}$, $e_3^{(2)}$, $e_1^{(1)} \times e_1^{(2)}$, $e_1^{(1)} \times e_2^{(2)}$,
$e_1^{(1)} \times e_3^{(2)}$, $e_2^{(1)} \times e_1^{(2)}$, $e_2^{(1)} \times e_2^{(2)}$, $e_2^{(1)} \times e_3^{(2)}$, $e_3^{(1)} \times e_1^{(2)}$, $e_3^{(1)} \times e_2^{(2)}$,
$e_3^{(1)} \times e_3^{(2)}$. We denote these directions by $l_i$ (i = 1..15). Note that some of these vectors
are not unit vectors.

Fig. 9.14 Computation of projected distances on a separating axis (label: separating plane)

In Fig. 9.14, $\mathbf{r}_1$ denotes a vector from the centre of the first OBB to one of its
vertices that gives the largest projection on the separating axis $\mathbf{l}_i$. Let us momentarily
assume that $\mathbf{l}_i$ is a unit vector. Then, the projected distance is given by $p_i^{(1)} = |\mathbf{r}_1 \cdot \mathbf{l}_i|$.
From Eq. 9.29, the eight possible values for $\mathbf{r}_1$ are $\pm w_1\mathbf{e}_1 \pm w_2\mathbf{e}_2 \pm w_3\mathbf{e}_3$. The
maximum projected distance is obtained by taking the positive values for each
projected component. Noting that $w_1$, $w_2$ and $w_3$ are always positive, we can define


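$$p_i^{(1)} = w_1\,|\mathbf{e}_1 \cdot \mathbf{l}_i| + w_2\,|\mathbf{e}_2 \cdot \mathbf{l}_i| + w_3\,|\mathbf{e}_3 \cdot \mathbf{l}_i| \tag{9.35}$$

Similarly, for the second OBB we obtain $p_i^{(2)} = |\mathbf{r}_2 \cdot \mathbf{l}_i|$. In the following, we use
superscripts in the summation to distinguish between parameters associated with the
first and the second OBBs.

$$
\begin{aligned}
p_i^{(1)} &= \sum_{k=1}^{3} w_k^{(1)}\,\bigl|\mathbf{e}_k^{(1)} \cdot \mathbf{l}_i\bigr| \\
p_i^{(2)} &= \sum_{k=1}^{3} w_k^{(2)}\,\bigl|\mathbf{e}_k^{(2)} \cdot \mathbf{l}_i\bigr|
\end{aligned} \tag{9.36}
$$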


The projected distance of the line of centres on $\mathbf{l}_i$ is $D_i = |(\mathbf{c}^{(2)} - \mathbf{c}^{(1)}) \cdot \mathbf{l}_i|$. The two
OBBs are separated if, for some $i$,

$$D_i > p_i^{(1)} + p_i^{(2)}, \quad i = 1..15. \tag{9.37}$$


The above inequality remains unchanged if we multiply both sides by a constant
$|\mathbf{l}_i|$. This means that the separating axis directions $\mathbf{l}_i$ can be used in the above
equations without converting them to unit vectors. From Fig. 9.14, we observe
that the vector $(\mathbf{c}^{(2)} - \mathbf{c}^{(1)})$ represents the translation of OBB$_2$ with respect to the
coordinate frame of OBB$_1$. We therefore denote this vector by $\mathbf{T}$. Similarly, let


$$
R = \begin{bmatrix} r_{00} & r_{01} & r_{02} \\ r_{10} & r_{11} & r_{12} \\ r_{20} & r_{21} & r_{22} \end{bmatrix} \tag{9.38}
$$

denote the rotational transformation of OBB$_2$ relative to the coordinate frame of
OBB$_1$. We can now represent all OBB axes directions relative to the coordinate
system of OBB$_1$ with the origin at its centre:


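$$
\begin{aligned}
\mathbf{e}_1^{(1)} &= (1, 0, 0), & \mathbf{e}_2^{(1)} &= (0, 1, 0), & \mathbf{e}_3^{(1)} &= (0, 0, 1) \\
\mathbf{e}_1^{(2)} &= (r_{00}, r_{10}, r_{20}), & \mathbf{e}_2^{(2)} &= (r_{01}, r_{11}, r_{21}), & \mathbf{e}_3^{(2)} &= (r_{02}, r_{12}, r_{22})
\end{aligned} \tag{9.39}
$$

In this reference system, we can also write $\mathbf{T} = t_1\mathbf{e}_1^{(1)} + t_2\mathbf{e}_2^{(1)} + t_3\mathbf{e}_3^{(1)}$ where

$$
t_1 = (\mathbf{c}^{(2)} - \mathbf{c}^{(1)}) \cdot \mathbf{e}_1^{(1)}, \quad
t_2 = (\mathbf{c}^{(2)} - \mathbf{c}^{(1)}) \cdot \mathbf{e}_2^{(1)}, \quad
t_3 = (\mathbf{c}^{(2)} - \mathbf{c}^{(1)}) \cdot \mathbf{e}_3^{(1)} \tag{9.40}
$$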


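It can be seen that with the above selection of the reference frame, the expressions
for $D_i$, $p_i^{(1)}$, $p_i^{(2)}$ get highly simplified, reducing the number of operations needed to
evaluate the inequality in Eq. 9.37. For example, when $\mathbf{l}_i = \mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)}$, we get the
following expressions for the projected distances:

$$
\begin{aligned}
D_i &= |\mathbf{T} \cdot \mathbf{l}_i| = \bigl|t_2\,\mathbf{e}_2^{(1)} \cdot (\mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)}) + t_3\,\mathbf{e}_3^{(1)} \cdot (\mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)})\bigr| \\
    &= |-t_2 r_{20} + t_3 r_{10}| \\
p_i^{(1)} &= w_2^{(1)}\,\bigl|\mathbf{e}_2^{(1)} \cdot (\mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)})\bigr| + w_3^{(1)}\,\bigl|\mathbf{e}_3^{(1)} \cdot (\mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)})\bigr| \\
    &= w_2^{(1)}\,|r_{20}| + w_3^{(1)}\,|r_{10}| \\
p_i^{(2)} &= w_2^{(2)}\,\bigl|\mathbf{e}_2^{(2)} \cdot (\mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)})\bigr| + w_3^{(2)}\,\bigl|\mathbf{e}_3^{(2)} \cdot (\mathbf{e}_1^{(1)} \times \mathbf{e}_1^{(2)})\bigr| \\
    &= w_2^{(2)}\,|-r_{11} r_{20} + r_{21} r_{10}| + w_3^{(2)}\,|-r_{12} r_{20} + r_{22} r_{10}| \\
    &= w_2^{(2)}\,|r_{02}| + w_3^{(2)}\,|r_{01}|
\end{aligned} \tag{9.41}
$$

The last expression given above is derived based on the fact that $\mathbf{e}_1^{(2)} \times \mathbf{e}_2^{(2)} = \mathbf{e}_3^{(2)}$
and $\mathbf{e}_1^{(2)} \times \mathbf{e}_3^{(2)} = -\mathbf{e}_2^{(2)}$. The complete set of expressions for the above
quantities for the 15 possible choices of $\mathbf{l}_i$ is given in Table 9.1.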
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Table 9.1 Formulae for computing projected distances of OBB radii and the line of 
centres for 15 separating axis directions 


Li Di p; pi? 
e; ? tj wi wi [roo] + wo roi 
ws? [roy 
ey? t| Lo wi? rio] -- wa? rii 
T w3 [rio] 
ey t5 ws? wi [rao] + wo rai 
3 ws?) rjj] 
e, O tiro + brio + træl  wi® [ro] + wo? rio] w1 9? 
+w” [roo] 
ej? tiro t fori t tara] wi? ri + wlr] we? 
+w” [rs | 
ey tiro T frio taro] wi?|roltw2 |r| w3? 
+w” |r] 
ea x e, O —tr»o + friol wo) [roo] +w ®|rio] w2 [roo] + ws? lro 
eD x ej? —bry + Bri wo [roi | + w3P |r| wi [roo] + w3 [roo 
eD x eO troy + tsnio| wo) [r5] +93 [ro] wy [roi | + w2 [oo 
ej x e,O fi r2 — fsrool wit? [roo] +03? [roo] — w2 [rio] F ws? Drs 
ey x ej? firi — fargil wi [ro | + w3 | roi wiO [rio] + ws? |rio 
ej x eO tirn — trol wit? [r5] ws? [ro] wP [ri | +w:® [rio 
e x e, O —tirio + troo] wit? [rio] +w®|ro] w2 [a2] + w3 |o 
e x e)? —tiry + toro wi? [rii | + w2” [roi wiO |r| + w3 | roo 
e x eO —tiri + hroz] wit? |r| wot? rog] wi proi | + wo? [rog 
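
As an illustration of the separating axis test, the sketch below evaluates Eqs. 9.36 and 9.37 generically for all 15 directions in the frame of the first OBB, rather than using the expanded per-axis formulae. The function name `obbs_separated`, the argument layout (centre, list of three unit axes, half-widths for each box) and the tolerance `eps` are assumptions made for this example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def obbs_separated(c1, A, w1, c2, B, w2, eps=1e-12):
    """Return True if a separating axis exists for the two OBBs.

    c1, c2: centres; A, B: lists of three unit axes; w1, w2: half-widths.
    """
    # Rotation of OBB 2 relative to OBB 1: r_ij = e_i(1) . e_j(2)  (Eq. 9.38)
    R = [[dot(A[i], B[j]) for j in range(3)] for i in range(3)]
    # Translation T expressed in the frame of OBB 1 (Eq. 9.40)
    diff = [c2[k] - c1[k] for k in range(3)]
    T = [dot(diff, A[i]) for i in range(3)]
    e1 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    e2 = [tuple(R[i][j] for i in range(3)) for j in range(3)]  # columns of R (Eq. 9.39)
    axes = e1 + e2 + [cross(a, b) for a in e1 for b in e2]
    for l in axes:
        if dot(l, l) < eps:
            continue  # cross product of parallel axes carries no information
        D = abs(dot(T, l))
        p1 = sum(w1[k] * abs(dot(e1[k], l)) for k in range(3))
        p2 = sum(w2[k] * abs(dot(e2[k], l)) for k in range(3))
        if D > p1 + p2:  # Eq. 9.37
            return True
    return False
```

The axes are deliberately left unnormalised, since scaling a direction scales both sides of Eq. 9.37 by the same factor.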


9.2.3 Sphere Intersection 


Sphere intersection tests are simpler than the tests required for other types
of bounding volumes. Collision detection algorithms often use spheres as the first
level of bounding volumes so that intersection tests can be carried out quickly. In
such situations, more accurate computations using tighter bounding volumes are
performed only if necessary. The condition for the intersection of a ray $\{\mathbf{p}, \mathbf{m}\}$
with a sphere $\{\mathbf{c}, r\}$ can be easily obtained by substituting the ray's parametric
representation (see Eq. 9.22) into the sphere's equation:


$$(\mathbf{p} + t\mathbf{m} - \mathbf{c}) \cdot (\mathbf{p} + t\mathbf{m} - \mathbf{c}) = r^2 \tag{9.42}$$

Since $\mathbf{m}$ is a unit vector, the above equation can be re-written as follows:

$$t^2 + 2t\,\mathbf{m} \cdot (\mathbf{p} - \mathbf{c}) + (\mathbf{p} - \mathbf{c}) \cdot (\mathbf{p} - \mathbf{c}) - r^2 = 0. \tag{9.43}$$

The above quadratic in $t$ will have real roots only if

$$(\mathbf{m} \cdot (\mathbf{p} - \mathbf{c}))^2 - (\mathbf{p} - \mathbf{c}) \cdot (\mathbf{p} - \mathbf{c}) + r^2 \ge 0. \tag{9.44}$$

The above inequality gives the necessary condition for the intersection of the ray
with the sphere. Depending on the signs of the real roots of the equation (if the above
condition is satisfied), we can classify the solution into four different categories:


* Roots are equal and positive. The ray is tangential to the sphere. 

* Roots are unequal and positive. The ray intersects the sphere at two distinct 
points. The minimum value of t gives the point closest to the origin of the ray. 

* One root is positive and the other negative. The origin of the ray is inside the 
sphere. 

* Both roots are negative. The ray is directed away from the sphere and the points 
of intersection are behind the origin of the ray. 


The last condition mentioned above can also be checked using a simple test. If
$\mathbf{m} \cdot (\mathbf{c} - \mathbf{p}) < 0$ and the origin of the ray lies outside the sphere, then the sphere
is behind the ray.
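
The root classification above can be sketched as follows. The function names `ray_sphere_roots` and `classify` and the returned labels are assumptions made for this illustration; the direction `m` is assumed to be a unit vector, as in Eq. 9.43.

```python
import math

def ray_sphere_roots(p, m, c, r):
    """Roots (t1 <= t2) of Eq. 9.43, or None if Eq. 9.44 is violated."""
    pc = [p[k] - c[k] for k in range(3)]
    b = sum(m[k] * pc[k] for k in range(3))      # m.(p - c)
    q = sum(x * x for x in pc) - r * r           # (p - c).(p - c) - r^2
    disc = b * b - q                             # left-hand side of Eq. 9.44
    if disc < 0:
        return None
    s = math.sqrt(disc)
    return (-b - s, -b + s)

def classify(roots):
    """Map the roots to the four categories listed in the text."""
    if roots is None:
        return "no intersection"
    t1, t2 = roots
    if t1 < 0 and t2 < 0:
        return "sphere behind ray"
    if t1 < 0 < t2:
        return "origin inside sphere"
    if t1 == t2:
        return "tangent"
    return "two points"
```

With exact equality of floating-point roots being fragile, a practical version would probably compare `disc` against a small tolerance instead of testing `t1 == t2`.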

A plane given by $\mathbf{r} \cdot \mathbf{n} = -d$ intersects the sphere $\{\mathbf{c}, r\}$ if the distance of the centre
of the sphere from the plane is less than the sphere's radius $r$. Assuming that $|\mathbf{n}| = 1$,
the necessary condition for intersection is (see also Eq. 2.24)

$$|\mathbf{c} \cdot \mathbf{n} + d| < r \tag{9.45}$$

Two spheres $\{\mathbf{c}_1, r_1\}$ and $\{\mathbf{c}_2, r_2\}$ intersect if and only if the distance between
their centres is less than the sum of their radii:

$$|\mathbf{c}_1 - \mathbf{c}_2| < r_1 + r_2. \tag{9.46}$$
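
Both sphere tests reduce to a single comparison, as the following sketch shows. The function names are assumptions made for this example; the sphere-sphere test compares squared quantities to avoid a square root, which is equivalent to Eq. 9.46 because both sides are non-negative.

```python
def sphere_intersects_plane(c, r, n, d):
    """Eq. 9.45 for the plane r.n = -d, assuming |n| = 1."""
    dist = sum(c[k] * n[k] for k in range(3)) + d
    return abs(dist) < r

def spheres_intersect(c1, r1, c2, r2):
    """Eq. 9.46, evaluated on squared distances."""
    dist_sq = sum((c1[k] - c2[k]) ** 2 for k in range(3))
    return dist_sq < (r1 + r2) ** 2
```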


9.2.4 k-DOP Intersection 


We saw in Sect. 9.1.4 that a k-DOP can be represented by k/2 slabs $\{dmin_j, dmax_j, \mathbf{n}_j\}$,
$j = 0 \ldots (k/2)-1$, where the $\mathbf{n}_j$ are unit normal directions associated with the slabs.
The slab-based intersection test outlined in Sect. 9.2.2 can be directly extended for
a k-DOP as shown below. For a given ray $\{\mathbf{p}, \mathbf{m}\}$, the interval of intersection on each
slab can be obtained using the following equations (see Eq. 9.32):


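$$
\begin{aligned}
\text{If } \mathbf{m} \cdot \mathbf{n}_j > 0: \quad t_{min}^{(j)} &= \frac{dmin_j - (\mathbf{p} \cdot \mathbf{n}_j)}{(\mathbf{m} \cdot \mathbf{n}_j)}, & t_{max}^{(j)} &= \frac{dmax_j - (\mathbf{p} \cdot \mathbf{n}_j)}{(\mathbf{m} \cdot \mathbf{n}_j)} \\
\text{If } \mathbf{m} \cdot \mathbf{n}_j < 0: \quad t_{min}^{(j)} &= \frac{dmax_j - (\mathbf{p} \cdot \mathbf{n}_j)}{(\mathbf{m} \cdot \mathbf{n}_j)}, & t_{max}^{(j)} &= \frac{dmin_j - (\mathbf{p} \cdot \mathbf{n}_j)}{(\mathbf{m} \cdot \mathbf{n}_j)}
\end{aligned} \tag{9.47}
$$

Here slab $j$ is the region $dmin_j \le \mathbf{r} \cdot \mathbf{n}_j \le dmax_j$, and $w_j = (dmax_j - dmin_j)/2$ is its
half-width; for a slab centred on the origin, $dmin_j = -w_j$ and $dmax_j = w_j$. After
computing the minimum and maximum values for all the slabs, the following values are
obtained:

$$
\begin{aligned}
u_{min} &= \max_j \{ t_{min}^{(j)} \} \\
u_{max} &= \min_j \{ t_{max}^{(j)} \}
\end{aligned} \tag{9.48}
$$

The ray intersects the k-DOP if and only if the intersection of the slab intervals
is non-empty. The necessary and sufficient condition for intersection is $u_{min} \le u_{max}$
(together with $u_{max} \ge 0$ when only points with $t \ge 0$ belong to the ray).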
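
A minimal sketch of the slab-based ray test is given below, assuming each slab is stored as a tuple `(dmin, dmax, n)` describing the region dmin ≤ r·n ≤ dmax. The function name `ray_kdop`, the handling of rays parallel to a slab, and the tolerance `eps` are assumptions added for this illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ray_kdop(p, m, slabs, eps=1e-12):
    """Return the interval (u_min, u_max) of the ray inside the k-DOP, or None."""
    u_min, u_max = float("-inf"), float("inf")
    for dmin, dmax, n in slabs:
        pn, mn = dot(p, n), dot(m, n)
        if abs(mn) < eps:
            # Ray parallel to this slab: it must start between the two planes
            if pn < dmin or pn > dmax:
                return None
            continue
        t0, t1 = (dmin - pn) / mn, (dmax - pn) / mn
        if t0 > t1:
            t0, t1 = t1, t0  # m.n < 0 swaps the entry and exit planes
        u_min, u_max = max(u_min, t0), min(u_max, t1)
    if u_min > u_max or u_max < 0:
        return None
    return (u_min, u_max)
```

An AABB is the special case k = 6 with the coordinate axes as slab normals, which makes a convenient check.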

An overlap test involving a k-DOP and a plane can be implemented by first
computing the vertices of the k-DOP and comparing them with the plane to
determine if some of them are located on the plane or on both sides of
the plane. The vertex positions are obtained by taking three planes of the k-DOP
at a time and computing their point of intersection using the method described in
Sect. 2.4.

The complexity of intersection tests using only k-DOPs can be significantly 
reduced if they are all constructed using the same set of normal directions. Two such 
k-DOPs $\{dmin_j^{(1)}, dmax_j^{(1)}, \mathbf{n}_j\}$, $\{dmin_j^{(2)}, dmax_j^{(2)}, \mathbf{n}_j\}$, $j = 0 \ldots (k/2)-1$, overlap
if and only if all corresponding pairs of intervals $[dmin_j^{(1)}, dmax_j^{(1)}]$, $[dmin_j^{(2)},
dmax_j^{(2)}]$ overlap. Thus, if the following condition is satisfied for any $j$, then the
two k-DOPs do not overlap.

$$(dmin_j^{(1)} > dmax_j^{(2)}) \ \text{or} \ (dmax_j^{(1)} < dmin_j^{(2)}) \tag{9.49}$$
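
The interval comparison of Eq. 9.49 can be sketched as below, assuming both k-DOPs are stored as lists of `(dmin, dmax)` pairs ordered by the same set of normals. The function name `kdops_overlap` is hypothetical.

```python
def kdops_overlap(a, b):
    """a, b: lists of (dmin, dmax) pairs for the same k/2 normal directions."""
    for (amin, amax), (bmin, bmax) in zip(a, b):
        # Eq. 9.49: a single disjoint pair of intervals separates the volumes
        if amin > bmax or amax < bmin:
            return False
    return True
```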


9.2.5 Triangle Intersection 


Bounding volume hierarchies (detailed in the next section) have to facilitate overlap 
tests using not only bounding volumes but also primitives at the lowermost levels 
of the tree. Since triangles are the most widely used mesh primitives, we discuss 
below the intersection tests using triangles. A triangle T will be represented by its 
three vertices as $\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$. The unit normal vector of the plane of the triangle will
be denoted by n. In the following, we assume that the triangle and the primitive it is 
tested against are both defined in a common reference frame. 

The intersection of a ray $\{\mathbf{q}, \mathbf{m}\}$ with the triangle T is computed by first checking
if the ray intersects the plane of the triangle, and then determining if the point of
intersection lies within the triangle. If the ray is not parallel to the plane of the
triangle ($\mathbf{m} \cdot \mathbf{n} \ne 0$), the intersection point is given by the value of the parameter $t$
in Eq. 2.23. If $t > 0$, the ray intersects the plane of the triangle, and the point of
intersection is given by $\mathbf{s} = \mathbf{q} + t\mathbf{m}$. If the three scalar triple products
$((\mathbf{p}_2 - \mathbf{p}_1) \times (\mathbf{s} - \mathbf{p}_1)) \cdot \mathbf{m}$, $((\mathbf{p}_3 - \mathbf{p}_2) \times (\mathbf{s} - \mathbf{p}_2)) \cdot \mathbf{m}$ and
$((\mathbf{p}_1 - \mathbf{p}_3) \times (\mathbf{s} - \mathbf{p}_3)) \cdot \mathbf{m}$ all have the
same sign (see Eq. 2.10), then the point of intersection lies inside the triangle. If
any of the triple products is zero, the intersection point lies on the boundary of the
triangle. If it is not necessary to compute the actual point of intersection with the
triangle, then the above test can be simplified into the condition that the values of
$((\mathbf{p}_1 - \mathbf{q}) \times (\mathbf{p}_2 - \mathbf{q})) \cdot \mathbf{m}$, $((\mathbf{p}_2 - \mathbf{q}) \times (\mathbf{p}_3 - \mathbf{q})) \cdot \mathbf{m}$ and
$((\mathbf{p}_3 - \mathbf{q}) \times (\mathbf{p}_1 - \mathbf{q})) \cdot \mathbf{m}$ have the
same sign for a valid intersection. We now deal with the case $\mathbf{m} \cdot \mathbf{n} = 0$ separately.
In this case, the ray must also lie on the plane of the triangle in order to possibly
intersect it. The necessary condition for this is $(\mathbf{q} - \mathbf{p}_1) \cdot \mathbf{n} = 0$. If the condition is
satisfied, then we compare the ray with the edges of the triangle to determine the
intersection points. As an example, we consider the intersection of the ray $\{\mathbf{q}, \mathbf{m}\}$
with the line segment $\mathbf{p}_1\mathbf{p}_2$ (Fig. 9.15), assuming that both lines are coplanar. Let
$\mathbf{u} = \mathbf{p}_2 - \mathbf{p}_1$, and $\mathbf{w} = \mathbf{q} - \mathbf{p}_1$.

Fig. 9.15 Intersection of a ray with an edge of a triangle given by the line segment $\mathbf{p}_1\mathbf{p}_2$

If the ray intersects the line segment, then their projections on to any of the 
principal planes must also intersect. The intersection test can thus be reduced to a 
two-dimensional problem, by selecting only two coordinates for which both u and m 
are non-zero and non-coincident vectors. In this two-dimensional space, let $\mathbf{u} = (u_1,
u_2)$, $\mathbf{m} = (m_1, m_2)$, and $\mathbf{w} = (w_1, w_2)$. Any point on the line segment $\mathbf{p}_1\mathbf{p}_2$ is given
in parametric form as $\mathbf{p}_1 + s\mathbf{u}$ ($0 \le s \le 1$), and any point on the ray as $\mathbf{q} + t\mathbf{m}$. At a
valid intersection point, we must have $\mathbf{p}_1 + s\mathbf{u} = \mathbf{q} + t\mathbf{m}$. This equation leads to the
following two simultaneous linear equations in $s$ and $t$:


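$$
\begin{aligned}
s\,u_1 - t\,m_1 &= w_1 \\
s\,u_2 - t\,m_2 &= w_2
\end{aligned} \tag{9.50}
$$

from which we get

$$
\begin{aligned}
s &= \frac{w_1 m_2 - w_2 m_1}{u_1 m_2 - u_2 m_1} \\
t &= \frac{w_1 u_2 - w_2 u_1}{u_1 m_2 - u_2 m_1}
\end{aligned} \tag{9.51}
$$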


If the above values for $s$ and $t$ satisfy the conditions $0 \le s \le 1$, $t \ge 0$, then the ray
intersects the edge $\mathbf{p}_1\mathbf{p}_2$ of the triangle. The denominators in the above expressions
become zero when the vectors $\mathbf{u}$ and $\mathbf{m}$ become parallel, in which case the ray does
not cross this edge.
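
The two-dimensional ray-edge computation of Eqs. 9.50 and 9.51 can be sketched as follows. The function name `ray_edge_2d`, the tuple-based point representation, and the tolerance `eps` used to detect parallel vectors are assumptions made for this example.

```python
def ray_edge_2d(q, m, p1, p2, eps=1e-12):
    """Intersect the ray q + t m with the segment p1 + s (p2 - p1) in 2D.

    Returns (s, t) for a valid intersection, otherwise None.
    """
    u = (p2[0] - p1[0], p2[1] - p1[1])
    w = (q[0] - p1[0], q[1] - p1[1])
    den = u[0] * m[1] - u[1] * m[0]
    if abs(den) < eps:
        return None  # u and m are parallel
    s = (w[0] * m[1] - w[1] * m[0]) / den   # Eq. 9.51
    t = (w[0] * u[1] - w[1] * u[0]) / den
    if 0 <= s <= 1 and t >= 0:
        return (s, t)
    return None
```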

Intersection tests with a triangle and a plane $\mathbf{r} \cdot \mathbf{n} = -d$ can be performed as
previously discussed in the context of AABB intersections, by computing the signed
distances $D_i$ of the vertices of the triangle from the plane as $D_i = \mathbf{p}_i \cdot \mathbf{n} + d$, $i = 1, 2, 3$.
If any of the signed distances is zero, or if any two distances have opposite signs,
then the plane intersects the triangle.
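
The signed-distance test can be written directly from the description above. The function name `triangle_intersects_plane` is an assumption for this sketch, and exact comparison with zero is used only for clarity; a tolerance would normally be needed with floating-point data.

```python
def triangle_intersects_plane(tri, n, d):
    """tri: three vertices; the plane is r.n = -d."""
    D = [sum(p[k] * n[k] for k in range(3)) + d for p in tri]
    if any(x == 0 for x in D):
        return True               # a vertex lies on the plane
    return min(D) < 0 < max(D)    # two vertices on opposite sides
```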

We now consider the problem of computing the intersection of one triangle 
with another. Here, we assume that the triangles are on two different planes, and 
as previously done for the ray intersection test, we will deal with the intersection 
of two coplanar triangles separately. Let the two triangles be $T_1 = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$
and $T_2 = \{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}$. The plane $\Pi_1$ containing the first triangle is given by
$\mathbf{r} \cdot \mathbf{n}_1 = -d_1$, where $\mathbf{n}_1$ is obtained by normalizing the vector $(\mathbf{p}_2 - \mathbf{p}_1) \times (\mathbf{p}_3 - \mathbf{p}_1)$,
and $d_1 = -\mathbf{p}_1 \cdot \mathbf{n}_1$. Similarly the plane $\Pi_2$ containing the second triangle is also
obtained as $\mathbf{r} \cdot \mathbf{n}_2 = -d_2$. We then use the procedure outlined in the previous
paragraph to determine if the first triangle $T_1$ is intersected by the plane $\Pi_2$, and
if the second triangle $T_2$ is intersected by the plane $\Pi_1$. If either of these tests fails,
then the triangles do not intersect. Otherwise, the plane of each triangle intersects
the other triangle as shown in Fig. 9.16. The line of intersection L of the planes also
intersects both triangles.

Fig. 9.16 An example showing the configuration of two triangles where the plane containing one
intersects the other

The two triangles overlap if the intervals of the triangles on the line L overlap. 
The equation of the line L is given by 


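$$\mathbf{r} = \mathbf{v} + t(\mathbf{n}_1 \times \mathbf{n}_2) \tag{9.52}$$

where $\mathbf{v}$ is a point on the line. To determine this point, we first select a component
of $\mathbf{n}_1 \times \mathbf{n}_2$ which is non-zero. If the x component of $\mathbf{n}_1 \times \mathbf{n}_2$ is non-zero, we will be
able to find a point $\mathbf{v} = (0, y, z)$ on L by solving the following two equations obtained
from the fact that this point lies on both the planes.

$$
\begin{aligned}
y\,n_{1y} + z\,n_{1z} + d_1 &= 0 \\
y\,n_{2y} + z\,n_{2z} + d_2 &= 0
\end{aligned} \tag{9.53}
$$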


Having obtained the parameters defining the equation of the line L, the next step
is to find the points of intersection of the line with the triangles. Consider an edge
of the triangle $T_2$ that intersects the line L as shown in Fig. 9.17. Such an edge
can be identified as having vertices that give opposite signs for signed distances
with respect to the plane $\Pi_1$. Let the vertices be $\mathbf{q}_i$, $\mathbf{q}_j$, with $|D_i|$, $|D_j|$ denoting the
magnitudes of the respective signed distances.

Fig. 9.17 Computation of the point of intersection of an edge of triangle $T_2$ with the line L


The point of intersection $\mathbf{w}$ of the edge with the line L can be computed as
follows:

$$
\mathbf{w} = \mathbf{q}_i + \frac{|\mathbf{w} - \mathbf{q}_i|}{|\mathbf{q}_j - \mathbf{q}_i|}\,(\mathbf{q}_j - \mathbf{q}_i)
           = \mathbf{q}_i + \frac{|D_i|}{(|D_i| + |D_j|)}\,(\mathbf{q}_j - \mathbf{q}_i) \tag{9.54}
$$

The position of $\mathbf{w}$ corresponds to a value of the parameter $t$ on the line L. This
value can be obtained from Eq. 9.52, by substituting $\mathbf{w}$ for $\mathbf{r}$, and choosing any
component that has a non-zero value for $\mathbf{n}_1 \times \mathbf{n}_2$. Similarly, we can obtain another
value of the parameter $t$ for the other edge of $T_2$ that intersects L. Thus we have
the interval $[t_1, t_2]$ of the line segment on L obtained by its intersection with $T_2$.
Repeating the whole process for the triangle $T_1$, and computing the signed distances
of its vertices with respect to $\Pi_2$, we can find the interval $[s_1, s_2]$ intersected by the
triangle on L. If both intervals overlap, then the two triangles intersect.
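
The interval method can be sketched as below for the general (non-coplanar) case. Several choices here are assumptions made for the illustration rather than the method as stated: positions on L are measured by the dot product of each crossing point with n1 × n2 instead of a single component, which rescales and shifts the parameter of both triangles identically and so does not affect the overlap decision; the normals are left unnormalised, since only signs and ratios of distances are used; and a triangle with a vertex lying exactly on the other plane is not handled.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def edge_plane_point(qi, qj, Di, Dj):
    """Eq. 9.54: crossing point of edge qi-qj, with Di and Dj of opposite sign."""
    f = abs(Di) / (abs(Di) + abs(Dj))
    return tuple(qi[k] + f * (qj[k] - qi[k]) for k in range(3))

def interval_on_line(tri, n, d, direction):
    """Interval cut from L by a triangle that straddles the plane r.n = -d."""
    D = [dot(q, n) + d for q in tri]
    ts = []
    for i in range(3):
        j = (i + 1) % 3
        if D[i] * D[j] < 0:
            w = edge_plane_point(tri[i], tri[j], D[i], D[j])
            ts.append(dot(w, direction))
    return (min(ts), max(ts)) if len(ts) == 2 else None

def triangles_intersect(T1, T2):
    n1 = cross(sub(T1[1], T1[0]), sub(T1[2], T1[0]))
    n2 = cross(sub(T2[1], T2[0]), sub(T2[2], T2[0]))
    d1, d2 = -dot(T1[0], n1), -dot(T2[0], n2)
    direction = cross(n1, n2)            # direction of the line L (Eq. 9.52)
    a = interval_on_line(T1, n2, d2, direction)
    b = interval_on_line(T2, n1, d1, direction)
    if a is None or b is None:
        return False                     # one triangle misses the other's plane
    return a[0] <= b[1] and b[0] <= a[1]
```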

Another approach to determining if the triangles overlap in three dimensions was 
recently proposed by Raabe et al. (2009). It uses the separating axis theorem (see 
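Sect. 9.2.2), considering triangles as degenerate polytopes. If $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ and $\mathbf{v}_1$, $\mathbf{v}_2$,
$\mathbf{v}_3$ denote the vectors along the sides of the two triangles (e.g., $\mathbf{u}_1 = \mathbf{p}_2 - \mathbf{p}_1$) then 9
separating axes directions can be formed as $\mathbf{l} = \mathbf{u}_i \times \mathbf{v}_j$ ($i, j = 1, 2, 3$). The vertices
$\mathbf{p}_i$ and $\mathbf{q}_i$ are projected on to each separating axis $\mathbf{l}$, and intervals $[t_1, t_2]$, $[s_1, s_2]$
computed as follows:

$$
\begin{aligned}
t_1 &= \min_i \{\mathbf{p}_i \cdot \mathbf{l}\}, & t_2 &= \max_i \{\mathbf{p}_i \cdot \mathbf{l}\} \\
s_1 &= \min_i \{\mathbf{q}_i \cdot \mathbf{l}\}, & s_2 &= \max_i \{\mathbf{q}_i \cdot \mathbf{l}\}
\end{aligned} \tag{9.55}
$$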


If the intervals do not overlap for any of the separating axes directions, then 
the triangles do not overlap. We now consider the problem of determining if two 
triangles on the same plane intersect. Here we can use a simplified version of the 
separating axis theorem. Two non-overlapping triangles lying on a common plane 
can be separated by a line parallel to one of the six sides. This means that if the 
triangles do not overlap, their vertices can be projected on to disjoint intervals on a 
line orthogonal to one of the sides (Fig. 9.18). 
Fig. 9.18 The separating axis theorem applied to a pair of co-planar triangles

We need to consider only six separating axis directions given by $\mathbf{u}_i \times \mathbf{n}$ and
$\mathbf{v}_i \times \mathbf{n}$ ($i = 1, 2, 3$). The vertices of each triangle are projected on to the separating
axis vector, and the projected intervals for the triangles computed as outlined above
(Eq. 9.55). If any of the interval pairs are disjoint, then the triangles do not overlap.
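
The coplanar case can be sketched with the six axes described above. The function name `coplanar_triangles_overlap` is hypothetical, and touching triangles are treated as overlapping in this version, which is a choice made for the example.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def coplanar_triangles_overlap(P, Q, n):
    """P, Q: triangles lying in a common plane with normal n."""
    edges = [sub(P[(i + 1) % 3], P[i]) for i in range(3)]
    edges += [sub(Q[(i + 1) % 3], Q[i]) for i in range(3)]
    for e in edges:
        l = cross(e, n)                      # in-plane axis orthogonal to the side
        tp = [dot(p, l) for p in P]          # Eq. 9.55 for the first triangle
        tq = [dot(q, l) for q in Q]          # Eq. 9.55 for the second triangle
        if max(tp) < min(tq) or max(tq) < min(tp):
            return False                     # disjoint intervals: separated
    return True
```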


9.3 Bounding Volume Hierarchies 


Bounding volume hierarchies (BVH) were briefly introduced in Sect. 3.4 in the 
context of scene graphs. Using a BVH, the space enclosed by a collection of objects 
that are located close to each other in a scene can be hierarchically represented in 
terms of bounding volumes of subgroups within the collection, with the leaf nodes 
containing sufficiently small object parts and optionally their bounding volumes. 
In this representation, a parent node stores the combined bounding volume of a set 
of objects in the child nodes. A bounding volume for a single complex mesh object 
may also be subdivided into groups of bounding volumes of smaller components 
or parts of the mesh. A bounding volume hierarchy can therefore be viewed as a 
multi-scale representation of an object using bounding volumes. The hierarchical 
tree structure of bounding volumes is useful in significantly reducing the number of
pair-wise overlap tests. Bounding volume overlap tests are performed from the root
of the tree to determine if the overall bounding volume intersects another primitive 
or another bounding volume. If the intersection test fails at this point, further tests 
using smaller sub-volumes stored in child nodes are not carried out. The complexity 
of intersection tests can thus be reduced using a well designed hierarchy. Some of 
the design considerations are 


* the efficiency and speed of computing the bounding volume parameters 
* the optimality of the computed bounding volumes 

* the amount of overlap between bounding volumes of sibling nodes 

* the frequency of updates 


The following two sections describe commonly used strategies for the construc- 
tion and traversal of bounding volume hierarchies. 
Fig. 9.19 A bounding volume hierarchy using AABBs formed using top-down construction


9.3.1 Top-Down Design 


The top-down construction of a bounding volume hierarchy starts with the formation 
of the bounding volume of an object, then recursively subdivides the mesh object 
into two nearly equal parts and stores their bounding volumes in the child nodes. 
The partitioning of the mesh is usually done by axis aligned splitting planes. Mesh 
primitives are assigned to either the left or the right child node based on the 
location of their centroids with respect to the splitting plane of the current node. The 
bounding volumes of the mesh sections stored in the child nodes are then computed. 
This process is repeated until a maximum level for the binary tree is reached, or until 
a node contains only a sufficiently small number of primitives. 

A bounding volume hierarchy constructed using AABB in the top-down fashion 
is shown in Fig. 9.19. At each node, the longest axis of the AABB is chosen, and the 
plane perpendicular to this axis passing through the centre of the AABB is selected 
as the splitting plane. For example, if the AABB is given by its mid point $(x_{mid}, y_{mid},
z_{mid})$ and the three half-width extents $x_w$, $y_w$, $z_w$, and if $x_w > y_w$ and $x_w > z_w$, then the
plane parallel to the yz-plane through the midpoint is chosen as the splitting plane.




Fig. 9.20 An agglomerative clustering algorithm forms small cluster groups and merges them 
recursively based on pair-wise distances between existing clusters 


The triangles of the mesh whose centres have the x-coordinate less than $x_{mid}$ are
assigned to the left child, and the remaining triangles to the right child. In Fig. 9.19, 
the yz-plane was chosen as the splitting plane at nodes 0 and 1, and the xz-plane at 
node 2. 
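
The top-down construction with AABBs can be sketched as follows. The dictionary-based node layout, the function names, and the parameters `max_leaf` and `max_depth` are assumptions made for this example; triangles are tuples of three vertices, and a node whose split would leave one side empty is simply kept as a leaf.

```python
def aabb_of(tris):
    """Axis-aligned bounding box (lo, hi) of a list of triangles."""
    pts = [v for t in tris for v in t]
    lo = tuple(min(p[k] for p in pts) for k in range(3))
    hi = tuple(max(p[k] for p in pts) for k in range(3))
    return lo, hi

def build_bvh(tris, max_leaf=2, depth=0, max_depth=16):
    lo, hi = aabb_of(tris)
    node = {"lo": lo, "hi": hi, "tris": None, "left": None, "right": None}
    if len(tris) <= max_leaf or depth >= max_depth:
        node["tris"] = tris
        return node
    # Split along the longest axis, through the midpoint of the box
    axis = max(range(3), key=lambda k: hi[k] - lo[k])
    mid = 0.5 * (lo[axis] + hi[axis])
    left = [t for t in tris if sum(v[axis] for v in t) / 3.0 < mid]
    right = [t for t in tris if sum(v[axis] for v in t) / 3.0 >= mid]
    if not left or not right:
        node["tris"] = tris       # centroids all fall on one side: stop here
        return node
    node["left"] = build_bvh(left, max_leaf, depth + 1, max_depth)
    node["right"] = build_bvh(right, max_leaf, depth + 1, max_depth)
    return node
```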

The top-down approach is particularly suitable for run-time construction of 
bounding volume hierarchies of large mesh objects using simple mesh partitioning 
strategies as outlined above. Therefore, most of the BVH algorithms use this 
method. 


9.3.2  Bottom-Up Design 


The bottom-up design approach is suitable for creating bounding volume hierarchies 
of a group of small objects that are located near each other. The construction of 
the tree starts with the bounding volumes of each object in the group which are 
then combined pair-wise, based on a distance measure. The bounding volume of the 
combined object is recomputed. This process is repeated until the bounding volume 
of the entire group is constructed at the root node. It is easy to see that methods
like this are closely analogous to agglomerative (bottom-up) hierarchical clustering
algorithms that use pair-wise distances to form larger and larger groups of objects
(Fig. 9.20).

The bottom-up construction has the advantage that the bounding volume updates 
at a parent node can be done by simply merging together the bounding volumes 
of the child nodes. Vertex data for objects or primitives stored in the leaf nodes
therefore need not be copied to the parent nodes. This mechanism is ideally suited 
for a scene graph based implementation where object data is stored only in leaf 
nodes (see Sect. 3.4). The main drawback of this approach is that the merged 
volume may not be the minimal volume for geometries such as the sphere (see 
Fig. 3.15). For AABBs and k-DOPs however, merging two minimal volumes 
does yield a minimal volume. Accurate computation of bounding volume requires 
merged primitive/object information also to be stored at internal nodes. Some of 
the commonly used methods for computing the parameters of the merged bounding 
volumes are discussed below. 

Given two AABBs $\{x_a^{(1)}, x_b^{(1)}, y_a^{(1)}, y_b^{(1)}, z_a^{(1)}, z_b^{(1)}\}$ and $\{x_a^{(2)}, x_b^{(2)}, y_a^{(2)}, y_b^{(2)},
z_a^{(2)}, z_b^{(2)}\}$, the AABB for the combined set of points is given by $\{x_a, x_b, y_a, y_b,
z_a, z_b\}$ where $x_a = \min(x_a^{(1)}, x_a^{(2)})$, $x_b = \max(x_b^{(1)}, x_b^{(2)})$, $y_a = \min(y_a^{(1)}, y_a^{(2)})$,
$y_b = \max(y_b^{(1)}, y_b^{(2)})$, $z_a = \min(z_a^{(1)}, z_a^{(2)})$ and $z_b = \max(z_b^{(1)}, z_b^{(2)})$. If the AABBs
are given in terms of their midpoints and half-width extents, the corresponding min-max
values are computed and the merged volume parameters obtained as above.
These parameters could then be converted back to the midpoint coordinates and
half-width extents.
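
A short sketch of the AABB merge, including the conversion between the two representations, is given below. The tuple layout `(xa, xb, ya, yb, za, zb)` and the function names are assumptions made for this example.

```python
def merge_aabb(a, b):
    """a, b: boxes stored as (xa, xb, ya, yb, za, zb) min-max values."""
    return (min(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), max(a[3], b[3]),
            min(a[4], b[4]), max(a[5], b[5]))

def to_minmax(mid, half):
    """Convert midpoint and half-width extents to min-max form."""
    return (mid[0] - half[0], mid[0] + half[0],
            mid[1] - half[1], mid[1] + half[1],
            mid[2] - half[2], mid[2] + half[2])

def to_mid_half(box):
    """Convert min-max form back to midpoint and half-width extents."""
    mid = tuple(0.5 * (box[2 * k] + box[2 * k + 1]) for k in range(3))
    half = tuple(0.5 * (box[2 * k + 1] - box[2 * k]) for k in range(3))
    return mid, half
```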

The equations for computing the parameters of a sphere formed by merging 
together two spheres were given in Eq. 3.3. Merging two OBBs is done by collecting 
the vertices (Eq. 9.29) of both OBBs and computing a new OBB for these 16 points 
using the methods outlined in Sect. 9.1.3. 

Two k-DOPs can be easily merged if they share the same set of normal vectors. 
The method is exactly the same as that used for AABBs. If the k-DOPs are 
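given by $\{dmin_j^{(1)}, dmax_j^{(1)}, \mathbf{n}_j\}$, $\{dmin_j^{(2)}, dmax_j^{(2)}, \mathbf{n}_j\}$, $j = 0 \ldots (k/2)-1$, then the
combined volume is $\{dmn_j, dmx_j, \mathbf{n}_j\}$, where $dmn_j = \min(dmin_j^{(1)}, dmin_j^{(2)})$ and
$dmx_j = \max(dmax_j^{(1)}, dmax_j^{(2)})$, for all $j$.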


9.3.3 Collision Testing Using Hierarchy Traversal 


Collision between a primitive (e.g., a ray) and an object (or a group of objects) 
can be detected by traversing the bounding volume hierarchy of the object(s) from 
the root node in a recursive manner. At each step, an overlap test is performed by 
comparing the bounding volume stored at a node of the tree with the primitive. 
Such a method is useful in reducing the number of ray-object intersection tests in 
ray tracing algorithms and also in games, where for instance, a ray represents the 
direction of flight of a bullet. A recursive ray intersection algorithm is given as
pseudo-code in Listing 9.2. In this code, node.volume represents a structure that
stores bounding volume parameters at a node, and overlap() is a method that 
tests if the given ray intersects the volume. At a leaf node, the ray is tested with the 
object primitive stored at that node. 
Listing 9.2 Ray-object intersection testing using a BVH


bool findIntersection(Ray ray, Node node) {
    if (node == null) return false;
    if (!isLeaf(node)) {
        // Internal node: skip the whole subtree if the ray misses its volume
        if (!overlap(ray, node.volume)) return false;
        // Report a hit if either subtree contains one
        return findIntersection(ray, node.left)
            || findIntersection(ray, node.right);
    } else {
        // Leaf node: test the ray against the stored object primitive
        return overlap(ray, node.object);
    }
}



Listing 9.3 Collision testing using two BVHs (Simultaneous descent)


bool findIntersection(Node a, Node b) {
    if ((a == null) || (b == null)) return false;
    if (!isLeaf(a)) {
        // Prune both subtrees if the bounding volumes are disjoint
        if (!overlap(a.volume, b.volume)) return false;
        if (!isLeaf(b)) {
            // Both internal: descend both trees simultaneously
            return findIntersection(a.left, b.left)
                || findIntersection(a.left, b.right)
                || findIntersection(a.right, b.left)
                || findIntersection(a.right, b.right);
        } else {
            // Only a is internal: descend a
            return findIntersection(a.left, b)
                || findIntersection(a.right, b);
        }
    } else {
        if (!isLeaf(b)) {
            // Only b is internal: descend b
            if (!overlap(a.volume, b.volume)) return false;
            return findIntersection(a, b.left)
                || findIntersection(a, b.right);
        } else {
            // Both a and b are leaf nodes: test the primitives
            return overlap(a.object, b.object);
        }
    }
}


Object-object intersection tests can be done using their respective bounding 
volume hierarchies. This process will require a systematic procedure that specifies 
how the trees must be traversed. A commonly used technique is to descend both 
hierarchies simultaneously using a depth-first approach. If the objects represented 
by the hierarchies intersect, the recursion will terminate at the leaf nodes of both 
trees. Pseudo-code for the method is given in Listing 9.3. In this code, node
'a' belongs to the first tree, and 'b' belongs to the second tree. The procedure is
called by passing the root nodes of the trees as parameters. It is assumed that the 
leaf nodes contain both primitive data (e.g., vertices of a triangle) as well as the 
bounding volume. 

An example showing two binary trees and the sequence of node comparisons 
used by the above method is given in Fig. 9.21. In this example, the primitives at 
nodes az and bg intersect. 
Fig. 9.21 An example showing the simultaneous descent of two bounding volume hierarchies


9.3.4 Cost Function 


The evaluation of the performance of a BVH-based method for collision detection 
is usually done with the help of a cost function. There are primarily three types of 
important operations performed: 


* Bounding volume updates are often required when the object undergoes transla- 
tional and rotational transformations. 

* Bounding volume overlap tests are performed when an internal node of a BVH 
is compared with an internal node of another BVH. 

* Primitive intersection tests are performed at the leaf nodes of a bounding volume 
hierarchy. 


The cost function is the aggregate of the costs for each of the above operations, 
and is defined as 


$$F = n_u C_u + n_v C_v + n_p C_p \tag{9.56}$$

where $C_u$ is the average cost of updating a bounding volume, $n_u$ the number of
bounding volumes updated, $C_v$ the average cost of testing if a pair of bounding
volumes overlap, $n_v$ the number of bounding volume overlap tests performed, $C_p$ the
average cost of testing if a pair of primitives intersect, and $n_p$ the number of primitive
intersection tests performed. The value of $n_v$ will be large if several bounding
volume overlap tests are done at internal nodes even when the primitives at the
leaves do not intersect. Therefore, selecting a tight fitting bounding volume for the
construction of the BVH helps in bringing down the value of $n_v$. However, tight
fitting bounding volumes such as convex hulls generally have a higher value of $C_v$
compared to AABBs and spheres. Reducing the number of internal nodes reduces the
number of bounding volumes that have to be updated and tested. The costs $C_u$, $C_v$, and
$C_p$ may be defined based on the number of geometrical computations such as vector
products involved in each operation.
9.4 Spatial Partitioning


In this section, we will look at some of the important spatial partitioning tree 
structures useful for collision detection. If a scene consists of n objects that 
can move and potentially collide with each other, the number of pair-wise tests 
required is n(n—1)/2. Spatial partitioning techniques help to subdivide the entire 
three-dimensional space occupied by the objects into a set of regions. Using 
such techniques we can quickly determine if objects in a group are not likely to 
intersect objects in another group (because they belong to disjoint regions), and 
thus eliminate the need for performing pair-wise tests between members of the 
two groups. A region based grouping of objects such that a member within any 
group is guaranteed not to intersect any member belonging to any of the other 
groups is called broad-phase collision detection. The grouping also suggests that 
objects within the same group may potentially collide. Pair-wise intersection tests 
using methods discussed in the previous section are used only to detect collision 
between objects within each group. Pair-wise tests using both bounding volumes 
and primitives are collectively called the narrow phase collision detection methods. 
Figure 9.22 provides an example showing the reduction in pair-wise tests achieved 
by a grid-based partitioning of the space into disjoint regions. 
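
The saving from a grid-based broad phase can be illustrated with a small sketch. It makes simplifying assumptions that are not part of the text: each object is represented by a single point and therefore falls in exactly one cell, whereas objects that straddle cell boundaries would have to be registered in every cell they touch. The function names and the cell size parameter are hypothetical.

```python
from itertools import combinations

def all_pairs(n):
    """Number of pair-wise tests without partitioning: n(n-1)/2."""
    return n * (n - 1) // 2

def grid_pairs(points, cell):
    """Candidate pairs after bucketing points into a uniform grid."""
    buckets = {}
    for idx, p in enumerate(points):
        key = tuple(int(c // cell) for c in p)   # integer cell coordinates
        buckets.setdefault(key, []).append(idx)
    # Only objects sharing a cell are passed on to the narrow phase
    return [pair for members in buckets.values()
            for pair in combinations(members, 2)]
```

For eight objects the brute-force count is 28, matching the value n(n−1)/2 given above; the number of candidate pairs after partitioning depends on how the objects are distributed over the cells.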


9.4.1 Octrees 


An octree defines a regular partitioning of an axis-aligned cube into eight equally
sized sub-cubes (octants) by halving the cube along each axis.
Each sub-cube is again divided into eight sub-cubes in a similar manner. The process
of recursively subdividing the cube continues until a pre-specified maximum for
the depth of the tree has been reached, or the cube size has become smaller than a
pre-specified minimum value. If the number of primitives within a cube is less than a
threshold value, and in particular, if a cube is empty, it is not subdivided further. The
initial cube encloses the whole three-dimensional space occupied by the objects, and
forms the root node of the octree. Each internal node of an octree has exactly eight
children corresponding to the sub-cubes of the parent node. A node is subdivided
only when necessary.

Fig. 9.22 An example showing the reduction in the number of pair-wise tests using spatial
partitioning (panel labels: number of pair-wise tests = 28; number of pair-wise tests = 7)

Fig. 9.23 A subdivision of a cube into octants. Cube indices are assigned based on the positions
of the sub-cubes relative to the midpoint of the parent

The geometrical information about a cube is stored in terms of the position of its
midpoint and size. Each cube is also assigned a unique index (Fig. 9.23). The root
node has index 0, and its children have indices 1 to 8. We use the notation $i{:}(x_c, y_c,
z_c, s)$ to denote the cube with index $i$, centre $(x_c, y_c, z_c)$ and side length $s$. When the
cube is subdivided into eight octants, its children are stored as follows:


$$
\begin{aligned}
8i+1 &: (x_c - (s/4),\ y_c - (s/4),\ z_c - (s/4),\ s/2) \\
8i+2 &: (x_c - (s/4),\ y_c - (s/4),\ z_c + (s/4),\ s/2) \\
8i+3 &: (x_c - (s/4),\ y_c + (s/4),\ z_c - (s/4),\ s/2) \\
8i+4 &: (x_c - (s/4),\ y_c + (s/4),\ z_c + (s/4),\ s/2) \\
8i+5 &: (x_c + (s/4),\ y_c - (s/4),\ z_c - (s/4),\ s/2) \\
8i+6 &: (x_c + (s/4),\ y_c - (s/4),\ z_c + (s/4),\ s/2) \\
8i+7 &: (x_c + (s/4),\ y_c + (s/4),\ z_c - (s/4),\ s/2) \\
8i+8 &: (x_c + (s/4),\ y_c + (s/4),\ z_c + (s/4),\ s/2)
\end{aligned} \tag{9.57}
$$
Using the above index representation, a node among a group of child nodes has 


an index k = 8i + j, j= 1..8. The parent's index i can be obtained from k using the 
integer division (k—1)/8. Applying the transformation b = (k —1) mod 8, we get 
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2 6 


Fig. 9.24 Space partitioning using octrees 


a value between 0 and 7 which can be represented using 3 bits. If the lowermost 
bit of b is 1, it indicates that the node is located towards the positive z direction 
from its parent. A bit value 0 indicates that the node is in the negative z direction. The 
middle bit similarly gives the node's relative location along the y direction (1: positive, 
0: negative). The highest bit gives the x direction. As an example, if a node's index 
is 30, its parent has the index 3, and b = 5 (= 101 in binary). The x and z coordinates 
of the node's centre are greater than those of its parent, while the y-coordinate is 
less. We can also use the index information to compute the bounding planes of a 
cube. Every cube except the root is bounded on three sides by axis-aligned splitting 
planes through the centre of its parent. These planes are x = xc, y = yc, z = zc, where 
(xc, yc, zc) is the midpoint of the parent cube. The remaining three bounding planes 
are given by the bit values of b. For the example given above, where b has a binary 
value 101, the three remaining bounding planes are x = xc + (s/2), y = yc - (s/2) 
and z = zc + (s/2). Note that s is the size of the parent, not the sub-cube under 
consideration. A cube's six bounding planes can also be directly obtained from its 
own centre and size, but the former method uses three common planes for every 
child of a given parent node, and requires only three additional planes for each 
child. 
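
The index arithmetic described above can be sketched in C++ as follows. This is a minimal illustration rather than the code of the companion programs: the struct `Cube` and the function names `parentIndex`, `positionBits` and `childCube` are hypothetical names introduced here, and the child geometry follows Eq. 9.57.

```cpp
#include <cassert>

// A cube i:(xc, yc, zc, s): index, centre and side length.
// (Hypothetical helper type, not part of the book's classes.)
struct Cube {
    long index;
    double xc, yc, zc, s;
};

// Parent index of node k, obtained by the integer division (k - 1)/8.
long parentIndex(long k) {
    assert(k >= 1);  // the root (index 0) has no parent
    return (k - 1) / 8;
}

// Three-bit code b = (k - 1) mod 8.
// Bit 2 gives the x direction, bit 1 the y direction, bit 0 the z direction.
int positionBits(long k) {
    assert(k >= 1);
    return static_cast<int>((k - 1) % 8);
}

// Child j (1..8) of a cube, following Eq. 9.57.
Cube childCube(const Cube& parent, int j) {
    assert(j >= 1 && j <= 8);
    const int b = j - 1;
    const double q = parent.s / 4.0;  // centre offset s/4
    const double dx = (b & 4) ? q : -q;
    const double dy = (b & 2) ? q : -q;
    const double dz = (b & 1) ? q : -q;
    return Cube{8 * parent.index + j,
                parent.xc + dx, parent.yc + dy, parent.zc + dz,
                parent.s / 2.0};
}
```

For the example in the text, `parentIndex(30)` gives 3 and `positionBits(30)` gives 5 (binary 101), so the node lies on the positive x, negative y and positive z side of its parent's centre.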

Figure 9.24 shows the subdivision of a three-dimensional volume containing a 
cylindrical object. The indexing of this volume using an octree is also shown in the 
figure. The initial volume is divided into eight octants as the volume is non-empty. 
In the next step, the non-empty volume with index 4 is further subdivided into eight 
octants. The indices of the children of node 4 have values from 33 to 40 (8*4 + j, 
j = 1..8). The object intersects the sub-cubes 35 and 36, and is therefore included in 
both these nodes. Further subdivision of these nodes is likely to produce intersection 
of the object with all sub-cubes, and therefore may not be carried out. 

A top-down traversal of the octree from the root node is often performed to locate 
the smallest cube of the tree that contains a given point P = (xp, yp, zp). If the current 
node is i:(xc, yc, zc, s), we can identify the next node containing P using its index k 
computed as given in Listing 9.4. 
Listing 9.4 Computation of the index of a child node that contains a point P 

Given: Node i:(xc, yc, zc, s) containing a point P = (xp, yp, zp) 
Output: Index k of the child node containing P 
if (xp <= xc) b1 = 0; else b1 = 1; 
if (yp <= yc) b2 = 0; else b2 = 1; 
if (zp <= zc) b3 = 0; else b3 = 1; 
k = 8*i + 4*b1 + 2*b2 + b3 + 1; 


Fig. 9.25 An example showing quadtree subdivision and traversal 


The two-dimensional equivalent of an octree is called a quadtree. A quadtree 
represents subdivisions of a square using four child nodes. For a quadtree, the 
computation of the index k in Listing 9.4 will use only the x and y coordinates. The 
corresponding formula for the index of the child node is k = 4*i + 2*b1 + b2 + 1. 
A group of four child nodes will thus have indices of the form 4i + j (j = 1..4). The 
position of a square relative to its parent is south-west if j = 1, north-west if j = 2, 
south-east if j = 3, and north-east if j = 4. This subdivision scheme establishes the 
method for quadtree descent, illustrated in Fig. 9.25. Note that a point on a vertical 
splitting line gets assigned to the square on its left, and a point on a horizontal 
splitting line gets assigned to the square below it. Note also that empty squares are 
not subdivided further. A similar traversal algorithm can be formulated for an octree. 
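
The child-index computation for both structures can be written as two small C++ functions. This is a sketch under the conventions stated above (a point on a splitting line or plane goes to the negative side); the function names `octreeChildIndex` and `quadtreeChildIndex` are hypothetical.

```cpp
#include <cassert>

// Index of the child of octree node i with centre (xc, yc, zc)
// that contains the point (xp, yp, zp), as in Listing 9.4.
long octreeChildIndex(long i, double xc, double yc, double zc,
                      double xp, double yp, double zp) {
    assert(i >= 0);
    const int b1 = (xp <= xc) ? 0 : 1;  // x direction (highest bit)
    const int b2 = (yp <= yc) ? 0 : 1;  // y direction (middle bit)
    const int b3 = (zp <= zc) ? 0 : 1;  // z direction (lowest bit)
    return 8 * i + 4 * b1 + 2 * b2 + b3 + 1;
}

// Quadtree equivalent: only the x and y coordinates are used.
// j = 1: south-west, 2: north-west, 3: south-east, 4: north-east.
long quadtreeChildIndex(long i, double xc, double yc,
                        double xp, double yp) {
    assert(i >= 0);
    const int b1 = (xp <= xc) ? 0 : 1;
    const int b2 = (yp <= yc) ? 0 : 1;
    return 4 * i + 2 * b1 + b2 + 1;
}
```

Repeating the call with the returned index and the centre of the selected child descends the tree one level at a time until a leaf is reached.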

We now use the octree traversal algorithm for finding the leaf nodes where a 
three-dimensional object is stored (as in Fig. 9.24). The object is stored in every 
leaf node it overlaps. To simplify the problem, we use the AABB of the object 
given in terms of the parameters {xmin, xmax, ymin, ymax, zmin, zmax}. We descend the 
octree using the points P = (xmin, ymin, zmin) and Q = (xmax, ymax, zmax) as discussed 
in the previous paragraph, and find the lowest node containing both P and Q. This 
internal node represents the minimum volume of the subdivision that contains the 
entire AABB, and hence the entire object. The children of this node are recursively 
examined to check if any of the sub-cubes overlap the given AABB. Since a cube 
itself is an AABB, we can use the AABB intersection test in Sect. 9.2.1 for this 
purpose. If a node does not overlap the given AABB, its children need not be tested. 
This process is repeated until we reach the leaf nodes and return the indices of 
those leaf nodes that overlap the AABB. This information is vital for the broad- 
phase collision detection of the object. The object can potentially intersect only 
other objects or primitives stored at these leaf nodes. 
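
The recursive part of this search can be sketched as follows. The types `AABB` and `OctreeNode` and the functions `overlaps`, `subdivide` and `collectLeaves` are hypothetical names used only for illustration. For brevity the recursion starts at the root; starting instead at the lowest node containing both P and Q, as described above, would only skip some of the upper levels.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct AABB { double xmin, xmax, ymin, ymax, zmin, zmax; };

// Two AABBs overlap when their extents overlap along all three axes.
bool overlaps(const AABB& a, const AABB& b) {
    return a.xmin <= b.xmax && b.xmin <= a.xmax &&
           a.ymin <= b.ymax && b.ymin <= a.ymax &&
           a.zmin <= b.zmax && b.zmin <= a.zmax;
}

struct OctreeNode {
    long index = 0;
    double xc = 0.0, yc = 0.0, zc = 0.0, s = 0.0;
    std::array<std::unique_ptr<OctreeNode>, 8> child;  // all null for a leaf

    bool isLeaf() const { return !child[0]; }
    AABB box() const {
        const double h = s / 2.0;
        return {xc - h, xc + h, yc - h, yc + h, zc - h, zc + h};
    }
};

// Split a leaf into eight octants using the layout of Eq. 9.57.
void subdivide(OctreeNode& n) {
    assert(n.isLeaf());
    const double q = n.s / 4.0;
    for (int b = 0; b < 8; ++b) {
        auto c = std::make_unique<OctreeNode>();
        c->index = 8 * n.index + b + 1;
        c->xc = n.xc + ((b & 4) ? q : -q);
        c->yc = n.yc + ((b & 2) ? q : -q);
        c->zc = n.zc + ((b & 1) ? q : -q);
        c->s = n.s / 2.0;
        n.child[b] = std::move(c);
    }
}

// Append the indices of all leaf nodes that overlap the given box.
// Subtrees whose cube misses the box are not visited.
void collectLeaves(const OctreeNode& n, const AABB& box,
                   std::vector<long>& out) {
    if (!overlaps(n.box(), box)) return;
    if (n.isLeaf()) { out.push_back(n.index); return; }
    for (const auto& c : n.child) collectLeaves(*c, box, out);
}
```

The returned indices identify the leaf nodes whose stored objects need to be considered in the broad phase.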

A bounding sphere enclosing an object can also be stored in an octree using a 
procedure similar to that described above. An octree node i:(xc, yc, zc, s) overlaps a 
bounding sphere with centre at position p = (xp, yp, zp) and radius r if and only if 
the distance between the centre of the sphere and the centre of node (cube) is less 
than or equal to the radius of the sphere. To avoid the square-root computation, this 
condition is usually expressed as follows: 


\[ (x_c - x_p)^2 + (y_c - y_p)^2 + (z_c - z_p)^2 \le r^2 \tag{9.58} \]
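
The first function below is a direct transcription of Eq. 9.58, which compares the squared distance between the two centres with the squared radius. The second function is not taken from the text: it is a commonly used alternative that clamps the sphere centre to the cube to find the closest point of the cube, and it also reports spheres that reach into the cube without containing its centre. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Eq. 9.58: squared centre-to-centre distance compared with r^2.
bool centreWithinSphere(double xc, double yc, double zc,
                        double xp, double yp, double zp, double r) {
    assert(r >= 0.0);
    const double dx = xc - xp, dy = yc - yp, dz = zc - zp;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

// Alternative test (an assumption, not the book's formula): distance
// from the sphere centre to the closest point of the cube of side s.
bool sphereTouchesCube(double xc, double yc, double zc, double s,
                       double xp, double yp, double zp, double r) {
    assert(r >= 0.0 && s >= 0.0);
    const double h = s / 2.0;
    // Clamp each coordinate of the sphere centre to the cube's extent.
    const double cx = std::max(xc - h, std::min(xp, xc + h));
    const double cy = std::max(yc - h, std::min(yp, yc + h));
    const double cz = std::max(zc - h, std::min(zp, zc + h));
    const double dx = cx - xp, dy = cy - yp, dz = cz - zp;
    return dx * dx + dy * dy + dz * dz <= r * r;
}
```

Neither function needs a square root, since only squared quantities are compared.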


In the next section, we will look at a recursive binary partitioning tree that is 
comparatively easier to traverse than an octree. 


9.4.2 k-d Trees 


A k-dimensional tree, also known as a k-d tree, represents a subdivision hierarchy 
that is generated by splitting a volume along one axis at a time, and changing the 
axis in a cyclic fashion at each subdivision step. A three-dimensional volume is 
commonly split first along the x-axis using a plane parallel to the yz-plane, then 
along the y-axis using a splitting plane parallel to the xz-plane, and then along the 
z-axis. The process continues in the next step by again splitting along the x-axis. In 
our discussion we will assume that the splitting planes are chosen in the x-y-z order. 
A k-d tree is a special case of a binary space partitioning (BSP) tree; in a general 
BSP tree, the splitting planes can have arbitrary normal directions. 

At the root level, every point that has the x-coordinate less than or equal to 
a chosen value x00 is put into the left child node, and points with x-coordinate 
greater than x00 go to the right child. The points in the child nodes are split using 
y-coordinate values y01 and y11. Choosing the splitting values x00, y01, y11, etc., as 
the median values of the points within the node gives a nearly equal number of points 
in both child nodes, and results in a well-balanced tree. An example showing the 
binary space partitioning of a planar region using a 2-d tree is shown in Fig. 9.26. 
The splitting values are shown inside the nodes. The first subscript gives the node's 
position within the same level, starting from 0 for the leftmost node. The second 
subscript indicates the level the node is in. 

A three-dimensional k-d tree stores the minimum and maximum extents of the 
volume it represents at the root using the six coordinate values {xmn, xmx, ymn, ymx, 
zmn, zmx}. The root node also stores the value x00 of the splitting plane used at 

Fig. 9.26 A binary partition of a two-dimensional region using a 2-d tree 


Listing 9.5 Sequential traversal of a k-d tree for locating a point P 

Given: A point P = (px, py, pz) 
Output: Leaf node containing P 

node = root 
axis = 1    //x-axis 
if P is outside the AABB {xmn, xmx, ymn, ymx, zmn, zmx} 
    return null    //P outside the volume 
while (!node.isLeaf())    //not a leaf node 
{ 
    if (axis == 1) {    //split along x 
        if (px <= node.value) node = node.left 
        else node = node.right 
    } 
    else if (axis == 2) {    //split along y 
        if (py <= node.value) node = node.left 
        else node = node.right 
    } 
    else if (axis == 3) {    //split along z 
        if (pz <= node.value) node = node.left 
        else node = node.right 
    } 
    axis = axis + 1 
    if (axis > 3) axis = 1 
} 
return node 


that level. The AABB of the left child is given by {xmn, x00, ymn, ymx, zmn, zmx} 
and that of the right child by {x00, xmx, ymn, ymx, zmn, zmx}. Child nodes generally 
store only splitting plane values, but algorithms such as the ray intersection test 
discussed below require AABB parameters of leaf nodes. Both the construction and 
the traversal of a k-d tree are done in a top-down fashion, starting from the root node 
that represents either a three-dimensional scene or the axis-aligned bounding box of 
a group of objects. The traversal of a k-d tree to the leaf node where a point P can be 
either inserted or located follows the pattern of the well-known sequential binary 
tree search algorithm. The pseudo-code for the method is given in Listing 9.5. 
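
A compact C++ sketch of the construction by median splitting and of the point location loop is given below. It is an illustration under simplifying assumptions, not the companion program: points are stored only at leaf nodes, the names `Point`, `KdNode`, `build` and `locate` are hypothetical, and the initial test of P against the root AABB is omitted.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Point = std::array<double, 3>;

struct KdNode {
    double value = 0.0;         // splitting value (internal nodes)
    std::vector<Point> points;  // stored points (leaf nodes)
    std::unique_ptr<KdNode> left, right;
    bool isLeaf() const { return !left && !right; }
};

// Build a k-d tree by median splitting, cycling through x, y, z.
// Points with coordinate <= splitting value go to the left child.
std::unique_ptr<KdNode> build(std::vector<Point> pts, int axis = 0,
                              std::size_t leafSize = 1) {
    assert(leafSize >= 1);
    auto node = std::make_unique<KdNode>();
    if (pts.size() > leafSize) {
        auto mid = pts.begin() +
                   static_cast<std::ptrdiff_t>((pts.size() - 1) / 2);
        std::nth_element(pts.begin(), mid, pts.end(),
                         [axis](const Point& a, const Point& b) {
                             return a[axis] < b[axis];
                         });
        node->value = (*mid)[axis];
        std::vector<Point> l, r;
        for (const Point& p : pts)
            (p[axis] <= node->value ? l : r).push_back(p);
        if (!r.empty()) {  // a genuine split was found
            const int next = (axis + 1) % 3;
            node->left = build(std::move(l), next, leafSize);
            node->right = build(std::move(r), next, leafSize);
            return node;
        }
        // All points share this coordinate: keep them in one leaf.
    }
    node->points = std::move(pts);
    return node;
}

// Sequential descent to the leaf whose region contains p.
const KdNode* locate(const KdNode* root, const Point& p) {
    assert(root != nullptr);
    const KdNode* node = root;
    int axis = 0;
    while (!node->isLeaf()) {
        node = (p[axis] <= node->value) ? node->left.get()
                                        : node->right.get();
        axis = (axis + 1) % 3;
    }
    return node;
}
```

The use of `std::nth_element` finds the median in linear time on average without fully sorting the points, which is one common design choice for this step.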
Fig. 9.27 Ray intersections with a volume represented by a k-d tree 


The broad-phase collision detection of objects is done by first identifying the 
leaf nodes of the k-d tree where an object’s bounding volume is stored. Every node 
of a k-d tree represents an axis-aligned box, and therefore the methods given in the 
previous section can be directly used to locate the positions of an AABB or a bounding 
sphere (see Eq. 9.58) within a k-d tree. 

A k-d tree is also used for ray tracing acceleration, as it can effectively restrict 
the computation of ray intersection tests along the direction of the ray. The root 
of the k-d tree represents the AABB of the scene of objects to be ray traced. The 
primary ray typically originates from a view point outside the scene. A secondary 
ray, on the other hand, may originate from a point inside the scene. In Fig. 9.27, 
ray-1 originates at P0, enters the AABB of the root node of the k-d tree at P1 and 
exits the scene at P4. Ray-2 originates at a point Q1 within the scene and exits the 
volume at Q3. In the case of ray-1, we can compute the position of P1 as well as 
the entry and exit distances P0P1, P0P4 using the parametric equation of the ray 
and the equations of the bounding planes of the AABB (see Sect. 9.2.1). Using the 
method in Listing 9.5, we can identify the leaf node of the k-d tree where P1 (or Q1 
for ray-2) is located. The ray is tested for intersection with the objects stored at this 
leaf node. If an intersection occurs, the point of intersection closest to the origin of 
the ray is returned. Otherwise, the ray is compared with the AABB of the leaf node 
and the next intersection point P2 with the current node is computed. This point is 
extended further by a small amount ε along the direction of the ray to get a point P2' 
that lies well within another node of the k-d tree: 


\[ P_2' = P_2 + \varepsilon\,\mathbf{d} \tag{9.59} \]

where d is the unit vector along the ray direction. The k-d tree is traversed again 
from the root to identify the leaf node to which P2' belongs. The objects in this 
Listing 9.6 Sequential ray intersection test using a k-d tree 

1. Given: A ray with the first point P1 inside the root's AABB. 
2. Output: Closest point of intersection with the ray 
3. Point p = P1 
4. Get the leaf node containing p (Listing 9.5) 
5. If ray intersects objects in the node, return the closest point of intersection. 
6. Compute the point of intersection p2 of the ray with the current node's AABB 
7. p = p2 shifted by a small distance along the ray 
8. if p is inside the root's AABB, go to 4 
9. else return false //no intersection 


there is no intersection, the process is continued by extending the ray to the next 
cell, and so on until the ray exits the AABB of the root node. The algorithm visits 
all leaf nodes intersected by the ray in a near to far order. Objects and primitives in 
the remaining leaf nodes are not compared with the ray. The pseudo-code for this 
method of sequential ray intersection test is given in Listing 9.6. 
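
The stepping loop of Listing 9.6 might be sketched in C++ as shown below. Several simplifications are assumed: the object intersection test is passed in as a callback `hit` that returns true when the ray intersects an object of the given leaf, each leaf carries a hypothetical identifier `id`, and the leaf AABB is recovered during the descent by shrinking the root AABB at every splitting plane. All type and function names are introduced here for illustration only.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <limits>
#include <memory>

using Vec = std::array<double, 3>;
struct Box { Vec lo, hi; };

struct KdNode {
    double value = 0.0;  // splitting value (internal nodes)
    int id = -1;         // leaf identifier (hypothetical)
    std::unique_ptr<KdNode> left, right;
    bool isLeaf() const { return !left; }
};

bool inside(const Box& b, const Vec& p) {
    for (int a = 0; a < 3; ++a)
        if (p[a] < b.lo[a] || p[a] > b.hi[a]) return false;
    return true;
}

// Descend to the leaf containing p; on return, box is that leaf's AABB.
const KdNode* locate(const KdNode* node, const Vec& p, Box& box) {
    int axis = 0;
    while (!node->isLeaf()) {
        if (p[axis] <= node->value) {
            box.hi[axis] = node->value;
            node = node->left.get();
        } else {
            box.lo[axis] = node->value;
            node = node->right.get();
        }
        axis = (axis + 1) % 3;
    }
    return node;
}

// Parameter t at which the ray p + t d leaves the box (p inside the box).
double exitDistance(const Box& b, const Vec& p, const Vec& d) {
    double t = std::numeric_limits<double>::infinity();
    for (int a = 0; a < 3; ++a) {
        if (d[a] > 0.0) t = std::min(t, (b.hi[a] - p[a]) / d[a]);
        else if (d[a] < 0.0) t = std::min(t, (b.lo[a] - p[a]) / d[a]);
    }
    return t;
}

// Visit leaves in near-to-far order along the ray. Returns the first
// leaf for which hit() reports an intersection, or nullptr if the ray
// leaves the root's AABB without any intersection.
const KdNode* traverse(const KdNode* root, const Box& rootBox, Vec p,
                       const Vec& d,
                       const std::function<bool(const KdNode&)>& hit,
                       double eps = 1e-6) {
    assert(root != nullptr && eps > 0.0);
    assert(d[0] != 0.0 || d[1] != 0.0 || d[2] != 0.0);
    while (inside(rootBox, p)) {
        Box box = rootBox;
        const KdNode* leaf = locate(root, p, box);
        if (hit(*leaf)) return leaf;
        const double t = exitDistance(box, p, d) + eps;  // Eq. 9.59
        for (int a = 0; a < 3; ++a) p[a] += t * d[a];
    }
    return nullptr;
}
```

Each iteration restarts the descent from the root, which keeps the method simple at the cost of repeated traversals of the upper levels of the tree.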

A k-d tree based spatial partitioning is useful in finding the point closest to a 
given point P within a three-dimensional volume. The location of the closest point 
gives information about the object which could most likely collide with the object 
or the bounding volume containing P. 

As shown in the two-dimensional example in Fig. 9.28, the algorithm for finding 
the nearest neighbour of P begins with the traversal of the k-d tree to the leaf node 
containing P and finding the closest point to P within that leaf node. The squared 
value of the distance between the two points is stored as the current minimum value 
of r². The position of P and the value of r² together are used to compare the sphere 
centred at P with the AABBs of other leaf nodes of the k-d tree using Eq. 9.58 for 
possible overlap. If a leaf node overlaps the sphere, the squared distances of points 
in that node from P are computed and if a value lower than the current minimum 
is found, then the value of r² is updated with the lowest found in that node. The 
process continues as shown in Fig. 9.28, until all nodes that overlap the sphere have 
been examined. The point that generated the minimum value for r² is selected as the 
nearest neighbour of P. The sphere-AABB overlap test excludes a large number of 
points that are separated from P by a distance greater than r from being compared. 
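
A two-dimensional sketch of the nearest neighbour search is given below. It is a commonly used recursive variant rather than a transcription of the procedure above: instead of testing the sphere against the AABB of every leaf, it skips a subtree when the current sphere of squared radius r² does not reach the splitting plane of its parent. The names `Point`, `KdNode`, `Best`, `build`, `dist2` and `nearest` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

using Point = std::array<double, 2>;

struct KdNode {
    double value = 0.0;         // splitting value (internal nodes)
    std::vector<Point> points;  // stored points (leaf nodes)
    std::unique_ptr<KdNode> left, right;
    bool isLeaf() const { return !left; }
};

// Median-split construction of a 2-d tree, alternating x and y.
std::unique_ptr<KdNode> build(std::vector<Point> pts, int axis = 0) {
    auto node = std::make_unique<KdNode>();
    if (pts.size() > 1) {
        auto mid = pts.begin() +
                   static_cast<std::ptrdiff_t>((pts.size() - 1) / 2);
        std::nth_element(pts.begin(), mid, pts.end(),
                         [axis](const Point& a, const Point& b) {
                             return a[axis] < b[axis];
                         });
        node->value = (*mid)[axis];
        std::vector<Point> l, r;
        for (const Point& p : pts)
            (p[axis] <= node->value ? l : r).push_back(p);
        if (!r.empty()) {
            node->left = build(std::move(l), 1 - axis);
            node->right = build(std::move(r), 1 - axis);
            return node;
        }
    }
    node->points = std::move(pts);
    return node;
}

double dist2(const Point& a, const Point& b) {
    const double dx = a[0] - b[0], dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

struct Best {
    double r2 = std::numeric_limits<double>::infinity();  // current minimum
    Point p{};                                            // point attaining it
};

// Search the side containing q first, then the other side only if the
// sphere of squared radius best.r2 around q crosses the splitting plane.
void nearest(const KdNode* node, const Point& q, int axis, Best& best) {
    assert(node != nullptr);
    if (node->isLeaf()) {
        for (const Point& p : node->points) {
            const double d = dist2(p, q);
            if (d < best.r2) { best.r2 = d; best.p = p; }
        }
        return;
    }
    const double diff = q[axis] - node->value;
    const KdNode* nearSide = (diff <= 0.0) ? node->left.get()
                                           : node->right.get();
    const KdNode* farSide = (diff <= 0.0) ? node->right.get()
                                          : node->left.get();
    nearest(nearSide, q, 1 - axis, best);
    if (diff * diff <= best.r2) nearest(farSide, q, 1 - axis, best);
}
```

A call such as `nearest(root.get(), q, 0, best)` leaves the closest point in `best.p` and the squared distance in `best.r2`.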

We saw earlier that both an octree and a k-d tree may store the same object in 
several leaf nodes if the object overlaps the volume of those nodes. For an octree, the 
splitting planes are fixed, but in a k-d tree, we can select the position of the splitting 
plane. Several heuristics, such as the surface area heuristic, have been proposed 
in the literature to minimize the amount of object overlap at leaf nodes. The next 
section introduces a subdivision structure that combines the desirable attributes of a 
k-d tree and a bounding volume hierarchy. 
Fig. 9.28 A sequence of computations performed on a k-d tree to find the nearest neighbour of a 
point P 


9.4.3 Bounding Interval Hierarchy 


A bounding interval hierarchy is a structure similar to a k-d tree, but uses two parallel 
partitioning planes for each node. For a given node, the plane perpendicular to and 
passing through the midpoint of the longest axis of the node's AABB is first chosen 
as the splitting plane. Assume that this axis is in the x-direction, and the position 
of the splitting plane is x0. The AABBs of the objects within the node's volume 
are then sorted along this axis. The objects whose AABBs have all x-coordinates 
less than or equal to x0 are assigned to the left child. AABBs that are entirely 
on the right of the splitting plane are assigned to the right child. Objects whose 
AABBs intersect the splitting plane are classified as belonging to the left or right 
child depending on which side of the splitting plane the AABBs have maximum 
overlap. The left partitioning plane is then defined using the maximum value of 
the x-coordinates of the AABBs belonging to the left child, and the right plane is 
defined using the minimum value of the x-coordinates of the AABBs belonging to 
the right (Fig. 9.29). The process continues by splitting each child node along the 
longest axis and defining two partitioning planes along that axis. A node containing 
only a single object is not subdivided further. 
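
One subdivision step along a single axis can be sketched as follows. The sketch assumes that the longest axis has already been chosen and that each object is represented by the extent of its AABB along that axis; the names `Interval`, `BihSplit` and `splitNode` are hypothetical. For an interval, having the larger overlap on the left of the splitting plane is equivalent to having its centre on or to the left of the plane, which is the test used here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Extent of an object's AABB along the chosen (longest) axis.
struct Interval { double lo, hi; };

struct BihSplit {
    std::vector<std::size_t> left, right;  // object indices per child
    double leftPlane;                      // max extent of left objects
    double rightPlane;                     // min extent of right objects
};

// Classify objects about the midpoint of [nodeLo, nodeHi] and compute
// the two parallel partitioning planes.
BihSplit splitNode(const std::vector<Interval>& boxes,
                   double nodeLo, double nodeHi) {
    assert(nodeLo < nodeHi);
    const double mid = 0.5 * (nodeLo + nodeHi);
    BihSplit s;
    s.leftPlane = -std::numeric_limits<double>::infinity();
    s.rightPlane = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const Interval& b = boxes[i];
        const double centre = 0.5 * (b.lo + b.hi);
        if (centre <= mid) {  // entirely left, or larger overlap on the left
            s.left.push_back(i);
            s.leftPlane = std::max(s.leftPlane, b.hi);
        } else {              // entirely right, or larger overlap on the right
            s.right.push_back(i);
            s.rightPlane = std::min(s.rightPlane, b.lo);
        }
    }
    return s;
}
```

Note that the left plane may lie to the right of the right plane when objects straddle the midpoint; the two child intervals are then allowed to overlap, while every object still belongs to exactly one child.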
Fig. 9.29 Partitioning of objects into "left" and "right" using two parallel partitioning planes 


The primary differences between a k-d tree and a bounding interval hierarchy are 
listed below. 

A k-d tree selects axes cyclically and defines a perpendicular splitting plane at the 
median point along the current axis direction, or alternatively uses a heuristic to 
position the splitting plane. A bounding interval hierarchy uses the longest axis 
of the current AABB, and the splitting plane is always positioned at the midpoint. 

In a k-d tree, the AABBs of the objects in a node are not sorted. A bounding 
interval hierarchy sorts AABBs along each axis. This speeds up the process of 
repeated classification of objects as left or right of splitting planes along the same 
axis. 

A k-d tree stores only the splitting plane position and optionally the parameters 
of the node AABB. A bounding interval hierarchy requires the positions of two 
partitioning planes and the axis information to be stored. 


By using two partitioning planes, a bounding interval hierarchy is able to classify 
each object in a node volume uniquely as either "left" or "right", without the 
need for placing an object that overlaps the splitting plane in both child nodes. A 
clear separation of objects is thus achieved, and the AABBs of the child nodes are 
closely aligned with the object AABBs within the nodes. The interval hierarchy 
thus provides a hierarchy of axis-aligned bounding volumes and also a spatial 
ordering similar to that of the k-d tree. Bounding interval hierarchies have been 
found particularly useful for real-time ray tracing. 

9.5 Summary 


This chapter has covered the main aspects of collision detection algorithms, including 
the representation of objects using bounding volumes, intersection tests between 
primitives and bounding volumes (BV), and hierarchical structures that are useful 
for minimizing the number of pair-wise object/primitive comparisons required in 
overlap tests. 

The important methods discussed in the context of bounding volume construction 
are Welzl's algorithm for computing the minimum bounding sphere, the 
computation of oriented bounding boxes, and the incremental construction of 
three-dimensional convex hulls. Algorithms for primitive-BV intersection tests and 
BV-BV intersection tests have been presented in detail. The separating axis theorem 
and the slab-based method are extremely useful for intersection tests involving 
oriented bounding boxes. 

The chapter also presented methods for the construction and traversal of bounding 
volume hierarchies. Spatial partitioning structures such as the octree and the k-d 
tree are useful for broad-phase collision detection as well as ray tracing algorithms. 
Both structures use axis-aligned splitting planes to facilitate efficient computation of 
ray intersection tests. The bounding interval hierarchy is a structure that combines 
the features of a bounding volume hierarchy and a k-d tree. 

The section Chapter9/Code on the companion website contains the following 
programs implementing and demonstrating the working of key algorithms discussed 
in this chapter. 


1. BoundingCircle.cpp 


Additional files: 
none 


The program demonstrates the working of Welzl’s algorithm for computing 
the minimum bounding circle for a set of points on the plane of display. Points 
are inserted by the user interactively using left mouse clicks. As each point is 
added, the minimum bounding circle is updated as discussed in Sect. 9.1.2. Press 
‘c’ to refresh the screen. 

2. InsertionHull.cpp 


Additional files: 
none 


The program demonstrates the insertion hull algorithm for incrementally 
constructing the convex hull of a set of points. Points are added interactively 
using left mouse clicks. As each point is added, the convex hull is updated using 
the algorithm discussed in Sect. 9.1.5. Press ‘c’ to refresh the screen and to start 
over again. 


3. BVH_AABB.cpp 


Additional files: 
Mesh.h 


The program loads a mesh file "object.off" and displays the bounding volume 
hierarchy constructed using AABBs (Sect. 9.3.1). It also shows the position of a 
ray (whose parameters are defined in the program) relative to the AABB, and the 
intersection points if the ray intersects the AABB. Press the 'z' key to go to the left 
child of the current node and the 'x' key to go to the right child. Use the left and right 
arrow keys to change the view direction. 


4. BVH_Sphere.cpp 


Additional files: 
Mesh.h 
Mesh.cpp 
object.off 


The program uses a cluster of triangles to demonstrate the bottom-up con- 
struction of a bounding volume hierarchy of spheres (Sect. 9.3.2). Clicking the 
left mouse button anywhere within the window causes the intersected bounding 
circles to be highlighted. Triangles that are excluded from intersection tests are 
also highlighted in grey color. 

5. KdTree.cpp 


Additional files: 
Mesh.h 
Mesh.cpp 
object.off 


The program generates a set of randomly distributed points and displays 
the k-d tree partitioning of the two-dimensional space. The program also 
demonstrates the nearest neighbour algorithm using the traversal of the k-d tree. 
The user inputs a point using left mouse click, and presses space bar to initiate the 
k-d tree search for the closest point. All points visited by the traversal algorithm 
are highlighted. The point found closest to the input point is also marked by a red 
coloured line segment connecting the two points. 


9.7 Bibliographical Notes 


Two important reference books for learning and developing collision detection 
algorithms are Ericson (2005) and van den Bergen (2003). These books deal with 
all aspects of collision detection including primitive tests, bounding volumes and 
acceleration algorithms. Books on real-time rendering (Moller et al. 2008) and 
game engine design (Eberly 2007; Eberly 2010) also give an extensive coverage of 
collision detection techniques. Collision detection is an area where a large number 
of computational geometry algorithms are used. Methods for pair-wise intersection 
tests, point inclusion tests, proximity tests and the construction of convex hulls 
are discussed in detail in de Berg (2000), O'Rourke (1998) and similar books on 
computational geometry. 

Early developments in collision detection methods were based mainly on primitive 
intersection tests and spatial partitioning. Samet's books (1990a, b) provided 
a comprehensive guide to spatial data structures. Toussaint (1983) introduced the 
rotating calipers algorithm, and Welzl (1991) the method for computing the smallest 
bounding disc. In the late 1990s some fundamental papers on bounding volume 
hierarchies using AABBs (van den Bergen 1997), OBBs (Gottschalk et al. 1996), 
triangle intersection tests (Moller 1997), and k-DOPs (Klosowski et al. 1998) 
appeared. 

Three recent publications on the use of hierarchical structures for ray tracing 
are Wald and Havran (2006), Cline et al. (2006), and Hapala and Havran (2011). 
Bounding interval hierarchies are introduced and their applications to real-time ray 
tracing discussed in Wachter and Keller (2006). 




References 


Cline, D., Steele, K., & Egbert, P. (2006). Lightweight bounding volumes for ray tracing. Journal 
of Graphics Tools, 11(4), 61-71. 

de Berg, M. (2000). Computational geometry: Algorithms and applications (2nd rev. ed.). New 
York: Springer. 

Eberly, D. H. (2007). 3D game engine design: A practical approach to real-time computer graphics 
(2nd ed.). Amsterdam/London: Morgan Kaufmann. 

Eberly, D. H. (2010). Game physics (2nd ed.). Burlington: Morgan Kaufmann/Elsevier. 

Ericson, C. (2005). Real-time collision detection. Amsterdam/Boston: Elsevier. 

Gottschalk, S., Lin, M. C., & Manocha, D. (1996). OBB-Tree: A hierarchical structure for rapid 
interference detection. In: Computer graphics (SIGGRAPH), New Orleans (pp. 171-180). 

Hapala, M., & Havran, V. (2011). Kd-tree traversal algorithms for ray tracing. Computer Graphics 
Forum, 30(1), 199-213. 

Kay, T. L., & Kajiya, J. T. (1986). Ray tracing complex scenes. In: Proceedings of computer 
graphics SIGGRAPH-86, Dallas (pp. 269-278). 

Klosowski, J. T., Held, M., et al. (1998). Efficient collision detection using bounding volume 
hierarchies of k-DOPs. IEEE Transactions on Visualization and Computer Graphics, 4(1), 
21-36. 

Moller, T. (1997). A fast triangle-triangle intersection test. Journal of Graphics, GPU and Game 
Tools, 2(2), 25-30. 

Moller, T., Haines, E., & Hoffman, N. (2008). Real-time rendering (3rd ed.). Wellesley: A.K. 
Peters. 

O’Rourke, J. (1998). Computational geometry in C (2nd ed.). Cambridge: Cambridge University 
Press. 

Raabe, A., Tietjen, T., & Anlauf, J. K. (2009). An exact and efficient triangle intersection 
test hardware. International conference on computer graphics theory (GRAPP-09), Lisbon, 
Portugal (pp. 355-360). 

Samet, H. (1990a). Applications of spatial data structures: Computer graphics, image processing, 
and GIS. Reading: Addison-Wesley. 

Samet, H. (1990b). The design and analysis of spatial data structures. Reading: Addison-Wesley. 

Toussaint, G. (1983). Solving geometric problems with the rotating calipers. In: 2nd IEEE 
Mediterranean Electrotechnical Conference (MELECON ’83), Athens. 

van den Bergen, G. (1997). Efficient collision detection of complex deformable models using 
AABB trees. Journal of Graphics Tools, 2(4), 1-14. 

van den Bergen, G. (2003). Collision detection in interactive 3D environments. San Francisco: 
Morgan Kaufmann. 

Wachter, C., & Keller, A. (2006). Instant ray tracing: The bounding interval hierarchy. In: 
Eurographics symposium on rendering, 26-28 June 2006, Cyprus (pp. 139-149). 

Wald, I., & Havran, V. (2006). On building fast kd-trees for ray tracing, and on doing that in 
O(N log N). In: IEEE symposium on interactive ray tracing, 18-20 Sep 2006, Salt Lake City, 
Utah (pp. 61-69). 

Welzl, E., et al. (1991). Smallest enclosing disks. In H. Maurer (Ed.), New results and new trends 
in computer science (Lecture notes in computer science, Vol. 555, pp. 359-370). New York: 
Springer. 


Appendices 


Appendix A: Geometry Classes 


This section gives a description of the methods in Point3, Vec3, Triangle and 
Matrix classes. The static relationships between the classes are shown in Fig. A.1. 


Fig. A.1 Relationships between geometry classes (Triangle, Point3, Vec3, Matrix) 


A.1 Point3 Class 


Fields 
protected: 
static float EPS; 
public: 
float _x, _y, _z, _h; 
Description: 
The data members of the class store the coordinates of a point. For 
programming convenience, the coordinates are declared as public, 
so that they can be directly accessed without the need for getter 
methods. The fourth component _h is initialized to 1 for points and 
0 for vectors. This component is not used for computing the norm, 
scalar product, and other operations such as addition, subtraction and 
negation. 
The static field EPS is a threshold used for checking if a floating point 
value is close enough to zero. Its value is set to 1.E-6. 

Constructors 
public: 
Point3(float x, float y, float z = 0) 
: _x(x), _y(y), _z(z), _h(1.0) {} 
Point3() 
: _x(0.0), _y(0.0), _z(0.0), _h(1.0) {} 
Description: 


The first constructor sets the values of x, y, z coordinates using its 
arguments. The h value is initialized separately to a default value 1.0. 
The second no-argument constructor initializes a point to the origin. 

Distance computation 

float norm() const; 
Description: 

This method computes the distance to a point from the origin or the 
length of a vector. 

Addition and subtraction 


Point3* add(const Point3* p) const; 
Vec3* subtract (const Point3* p) const; 


Description: 
The add method adds the x, y, z coordinates of the current point with 
the corresponding coordinate values of p, and produces a new point. 
The h coordinate values are not added. The resulting point is assigned 
an h value 1.0. This method is overridden in the subclass Vec3 which 
sets the h value to 0. The subtract method similarly subtracts the 
coordinates of p from that of the current point and produces a vector 
originating at p. 

Negation 
Point3* negate() const; 
Description: 


This method negates the x, y, z coordinates of the current point. The h 
coordinate value is not negated. 


Scalar multiplication 


Point3* scalarMult(float c) const; 
Description: 
This method scales the x, y, z coordinates of the current point by the 
constant factor c, and produces a new point. The resulting point is 
assigned an h value 1.0. 

Conversion to standard form 


Point3* standard(); 


Description: 
This method converts the current point to standard form by applying 
the transformation: (x, y, z, h) => (x/h, y/h, z/h, 1), provided h ≠ 0. 
Output 
void print() const; 
Description: 


This method prints the x, y, z, h coordinates of the current point or 
vector. 


A.2 Vec3 Class 


The Vec3 class is a subclass of Point3. 
Fields 


private: 
static float RADTODEG; 
public: 
static const Vec3* X_AXIS; 
static const Vec3* Y_AXIS; 
static const Vec3* Z_AXIS; 
Description: 
The static field RADTODEG stores the multiplication factor (= 180/π) 
for conversion from radians to degrees. 
The static fields X_AXIS, Y_AXIS, Z_AXIS store respectively the 
orthogonal basis vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1). 


Constructors 
public: 
Vec3(float x, float y, float z = 0) 
: Point3(x, y, z) {_h = 0;} 
Vec3() {_h = 0;} 
Description: 


The constructors invoke the base class constructors and additionally 
set the value of _h to 0. 


Dot and cross products 


float dot(const Vec3* v) const; 
Vec3* cross(const Vec3* v) const; 
Description: 
The dot method returns the dot product of the current vector and v. 
The cross method returns a vector as the result of the cross product 
between the current vector and v. 

Vector normalization 


void normalize(); 


Description: 
The method converts the current vector to a unit vector by dividing its 
components by the length of the vector. 


Reflection of a vector 


Vec3* reflect(const Vec3* n) const; 


Description: 
The method computes the reflection of the current vector with respect 
to n using the formula in Eq. 2.5. 


Computation of angles 


float angle(const Vec3* v) const; 
float angle2(const Vec3* v) const; 
float signedAngle(const Vec3* v, const Vec3* w) const; 


Description: 

The method angle first converts the current vector and the input 
vector v to unit vectors, and then computes the angle between them 
using the inverse cosine of the dot product of the two vectors. The 
value is returned in degrees in the range [0, 180]. The method angle2 
uses both dot and cross products to compute the angle using the 
formula θ = tan⁻¹(|u × v|, u · v). The signedAngle method uses 
Eq. 2.6 to compute the signed angle between the current vector and v 
with respect to a given view direction w. 
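
The two unsigned angle computations can be sketched as free functions. This is an illustration only and does not reproduce the Vec3 class: the struct `V3`, the helper functions and the constant `kRadToDeg` are hypothetical names, and the two-argument inverse tangent is taken to be `std::atan2`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct V3 { double x, y, z; };  // hypothetical stand-in for a vector type

double dot(const V3& u, const V3& v) {
    return u.x * v.x + u.y * v.y + u.z * v.z;
}

V3 cross(const V3& u, const V3& v) {
    return {u.y * v.z - u.z * v.y,
            u.z * v.x - u.x * v.z,
            u.x * v.y - u.y * v.x};
}

double length(const V3& u) { return std::sqrt(dot(u, u)); }

// Radians to degrees: multiply by 180/pi.
const double kRadToDeg = 180.0 / std::acos(-1.0);

// Angle from the inverse cosine of the dot product of unit vectors.
double angle(const V3& u, const V3& v) {
    const double lu = length(u), lv = length(v);
    assert(lu > 0.0 && lv > 0.0);
    double c = dot(u, v) / (lu * lv);
    c = std::max(-1.0, std::min(1.0, c));  // guard against rounding error
    return std::acos(c) * kRadToDeg;
}

// Angle from both products: theta = atan2(|u x v|, u . v).
double angle2(const V3& u, const V3& v) {
    return std::atan2(length(cross(u, v)), dot(u, v)) * kRadToDeg;
}
```

Both functions return values in the range [0, 180]. The second form does not require normalization and is generally better conditioned for nearly parallel vectors, where the inverse cosine loses precision.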


A.3 Triangle Class 


Fields 
private: 
const Point3 *_a, *_b, *_c; 
Description: 
The data members of the class store references to the three vertices of 
a triangle. 


Constructors 
public: 
Triangle(const Point3* a, const Point3* b, const Point3* c) 
: _a(a), _b(b), _c(c) {} 


Description: 
The non-default constructor requires three references to objects of the 
Point3 class. Methods of the class use these points as vertices of the 
triangle. 

Computation of area 
float area() const; 
float signedArea2D() const; 
float signedArea3D(const Vec3* w) const; 


Description: 

The method area computes the area of the current triangle using 
the cross product of vectors along two edges as given in Eq. 2.3. 
The method signedArea2D returns the area of the triangle which 
has a negative sign if the angle between the normal direction and 
the z-axis is greater than 90°. The function signedArea3D uses 
a similar approach by using a user specified vector w instead of the 
z-axis (Eq. 2.8). 
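
A minimal sketch of the area computations follows. The struct P3 and the free functions taking three vertices are hypothetical simplifications of the Triangle class. The signed variants are written here as the unsigned area with the sign of n · w, which matches the description above but is an assumption about the exact form of Eq. 2.8.

```cpp
#include <cmath>

// Hypothetical stand-in for Point3 / Vec3 (illustration only).
struct P3 { float x, y, z; };

P3 sub(const P3& a, const P3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

float dot(const P3& a, const P3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

P3 cross(const P3& a, const P3& b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Unnormalized normal of triangle abc: (b - a) x (c - a).
P3 triNormal(const P3& a, const P3& b, const P3& c) {
    return cross(sub(b, a), sub(c, a));
}

// Area is half the magnitude of the cross product of two edge vectors.
float area(const P3& a, const P3& b, const P3& c) {
    P3 n = triNormal(a, b, c);
    return 0.5f * std::sqrt(dot(n, n));
}

// Negative when the normal makes an angle greater than 90 degrees with w.
float signedArea3D(const P3& a, const P3& b, const P3& c, const P3& w) {
    float A = area(a, b, c);
    return dot(triNormal(a, b, c), w) < 0.0f ? -A : A;
}

// Same test against the z-axis.
float signedArea2D(const P3& a, const P3& b, const P3& c) {
    return signedArea3D(a, b, c, P3{ 0.0f, 0.0f, 1.0f });
}
```

Reversing the vertex order flips the normal, so a triangle in the xy-plane has a positive signed area when its vertices are listed anticlockwise as seen from the positive z-axis.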


Computation of barycentric coordinates 


Point3* barycentricCoords(const Point3* p) const; 
Description: 
This method computes the barycentric coordinates of the point p with 
respect to the current triangle using area ratios as given in Eq. 2.48. 


Barycentric mapping 
Point3* barycentricMap 
(const Point3* p, const Triangle* t) const; 
Description: 
A point p and a triangle t containing p are given. This method 
computes the image of p in the current triangle as shown in Fig. 2.12. 


Point inclusion test 
bool isInside(const Point3* p) const;
Description: 


This function uses barycentric coordinates to determine if a point p 
lies within and on the plane of the current triangle. 
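
The barycentric computation and the inclusion test can be sketched together. The types P3 and Bary, the free-function form and the tolerance eps are all hypothetical; the area ratios are evaluated as signed ratios against the triangle normal, which is one common way of realising the area-ratio formula and is assumed here rather than taken from Eq. 2.48.

```cpp
#include <cmath>

struct P3 { float x, y, z; };      // hypothetical stand-in for Point3
struct Bary { float a, b, c; };    // weights of vertices a, b, c

P3 sub(const P3& u, const P3& v) { return { u.x - v.x, u.y - v.y, u.z - v.z }; }

float dot(const P3& u, const P3& v) { return u.x * v.x + u.y * v.y + u.z * v.z; }

P3 cross(const P3& u, const P3& v) {
    return { u.y * v.z - u.z * v.y, u.z * v.x - u.x * v.z, u.x * v.y - u.y * v.x };
}

// Signed area ratios: area(p,b,c)/area(a,b,c) and area(p,c,a)/area(a,b,c).
// Assumes a non-degenerate triangle; for p off the plane this uses its projection.
Bary barycentricCoords(const P3& a, const P3& b, const P3& c, const P3& p) {
    P3 n = cross(sub(b, a), sub(c, a));
    float nn = dot(n, n);
    float wa = dot(cross(sub(b, p), sub(c, p)), n) / nn;
    float wb = dot(cross(sub(c, p), sub(a, p)), n) / nn;
    return { wa, wb, 1.0f - wa - wb };
}

// True when p lies on the triangle's plane (within eps) and inside the triangle.
bool isInside(const P3& a, const P3& b, const P3& c, const P3& p, float eps = 1e-5f) {
    P3 n = cross(sub(b, a), sub(c, a));
    float nn = dot(n, n);
    if (nn == 0.0f) return false;                 // degenerate triangle
    float d = dot(sub(p, a), n);                  // distance to plane, scaled by |n|
    if (d * d > eps * eps * nn) return false;     // p is off the plane
    Bary w = barycentricCoords(a, b, c, p);
    return w.a >= 0.0f && w.b >= 0.0f && w.c >= 0.0f;
}
```

Because the three weights sum to 1, checking that none is negative is enough to confine each of them to [0, 1].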


Bilinear interpolation 
Point3* bilinear(int k1, int k2) const;
Description:
This method returns a point computed using the bilinear interpolation
formula in Eq. 2.45. The arguments k1 and k2 must satisfy the
condition that k1, k2, and k1 + k2 all have values in the range [0, 1].


OpenGL drawing 
void draw() const; 


Description: 
This method draws the current triangle using OpenGL functions.


A.4 Matrix Class


Fields 
private: 
float _v[4][4];


Description:
The Matrix class represents the data structure for a 4 × 4 matrix,
with its values stored in the two-dimensional array _v.
Constructors 
public: 
Matrix();
Matrix(float values[][4]);
Matrix(const Vec3* u, const Vec3* v, const Vec3* w);
Description: 
The default constructor initializes the matrix with the identity matrix. 
The second constructor initializes the matrix using a two-dimensional 
array of values. The values are stored in row-major order. The third 
constructor forms the matrix using three vectors u, v, w as the first 
three columns of the matrix. The last column has values 0, 0, 0, 1. 


Identity matrix 


void identity(); 
Description: 
This method resets the current matrix to the identity matrix. 
Accessing matrix elements 


float valueAt(int i, int j) const; 


Description: 
This is a getter method that returns the value of _v[i][j].


Setting matrix elements 
void setValue(int i, int j, int value); 
Description: 


This is a setter method that replaces the value of _v[i][j] with
value.


Transpose and inverse 
void transpose(); 
void inverse(); 
Description: 

The method transpose modifies the current matrix by replacing 
it with its transpose. Similarly inverse replaces the current matrix 
with its inverse, provided the matrix is invertible. If the determinant of 
the current matrix is 0, it is not changed.


Point transformation


Point3* transform(const Point3* p); 


Description: 
This method returns a new point computed by pre-multiplying the 
point p by the current matrix. 
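
The point transformation can be sketched as a product of a row-major 4 × 4 matrix with the homogeneous column (x, y, z, 1). The types P3 and Mat4 are hypothetical value-type stand-ins for Point3 and Matrix, and the sketch assumes an affine matrix whose last row is (0, 0, 0, 1), so no division by the fourth component is performed.

```cpp
// Hypothetical stand-ins for the Point3 and Matrix classes (illustration only).
struct P3 { float x, y, z; };

struct Mat4 {
    float v[4][4];   // row-major, as in the class description

    // Start from the identity matrix, like the default constructor.
    Mat4() {
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                v[i][j] = (i == j) ? 1.0f : 0.0f;
    }

    // Computes p' = M p, treating p as the homogeneous column (x, y, z, 1).
    // Assumes an affine matrix whose last row is (0, 0, 0, 1).
    P3 transform(const P3& p) const {
        const float in[4] = { p.x, p.y, p.z, 1.0f };
        float out[3] = { 0.0f, 0.0f, 0.0f };
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 4; ++j)
                out[i] += v[i][j] * in[j];
        return { out[0], out[1], out[2] };
    }
};
```

With this layout the translation components occupy the last column, v[0][3], v[1][3] and v[2][3].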
Matrix copy 


Matrix* copy(); 


Description: 
Often it is required to keep a copy of the current matrix before 
computing its transpose or inverse. This method returns a reference 
to a new matrix object that contains the same values as the current 
matrix. 


Output 


void print(); 


Description: 
This method prints the values of the current matrix in 4 × 4 format.


Appendix B: Scene Graph Classes 


This section gives an outline of the methods in the scene graph classes. A description 
of these classes can be found in Sect. 3.5. The static relationships between the 
classes are shown in Fig. B.1. 


Fig. B.1 Relationships between scene graph classes: GroupNode is the
base class of ObjectNode, CameraNode and LightNode


B.1 GroupNode Class 


Fields 


private:
list<GroupNode*> _children;
protected:
GroupNode* _parent;
float _tx, _ty, _tz, _angleX, _angleY, _angleZ;


Description:

The list variable _children stores references to the children of the
current group node, in an STL list structure. The access level for this
variable is declared as private since all subclasses are leaf nodes
that do not have children. Each group node also stores a reference to its
parent node in the variable _parent. It has a value NULL for the root
node. Every group node also stores the translation parameters _tx,
_ty, _tz and rotation angles _angleX, _angleY, _angleZ which
define the transformation of the current node to the coordinate frame
of the parent node.


Constructors


GroupNode() 


Description: 
The class contains only one no-argument constructor that initializes 
the parent node to NULL and the transformation parameters to zeros. 


Add/remove child 
void addChild(GroupNode* node); 
void removeChild(GroupNode* node); 
Description: 

The method addChild includes the specified node as a child 
node of the current node. The method removeChild removes the 
specified node, if it exists, from the list of children of the current 
node. 
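
The child list management can be sketched with std::list. The class name Node and the accessor childCount are hypothetical, and updating the child's parent pointer inside addChild and removeChild is an assumption about the behaviour of the GroupNode methods, which the description above does not spell out.

```cpp
#include <cstddef>
#include <list>

// Hypothetical, reduced version of GroupNode (illustration only).
class Node {
public:
    Node() : _parent(nullptr) {}

    // Appends node to the child list and records this node as its parent.
    void addChild(Node* node) {
        node->_parent = this;
        _children.push_back(node);
    }

    // std::list::remove erases every element equal to node; it is a no-op
    // when node is not in the list.
    void removeChild(Node* node) {
        std::size_t before = _children.size();
        _children.remove(node);
        if (_children.size() != before) node->_parent = nullptr;
    }

    Node* getParent() const { return _parent; }
    std::size_t childCount() const { return _children.size(); }

private:
    std::list<Node*> _children;
    Node* _parent;
};
```

The list stores raw pointers without taking ownership, so in this sketch the caller remains responsible for the lifetime of every node.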


Node transformation 
void translate(float tx, float ty, float tz); 
void rotateX(float angle); 
void rotateY(float angle); 
void rotateZ(float angle); 


Description: 
The above methods set the transformation parameters of the current
node. The node transformation is always assumed to be of the form
TR, i.e. a rotation R followed by a translation T.


Inverse transformation 


void inverseTransform() const; 


Description: 
This method uses OpenGL functions to push the matrices for the
inverse transformation (TR)⁻¹ = R⁻¹T⁻¹ of the current node to the
transformation stack. Note that this function does not explicitly
generate the inverse transformation matrix.
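
The order of the two inverse factors can be checked without OpenGL. The sketch below is a simplified illustration with hypothetical names (NodeXform, toParent, fromParent) and a single rotation about the z-axis: it applies TR to a point, then undoes it by applying the inverse translation first and the inverse rotation second, which is the effect of the product R⁻¹T⁻¹ on a column vector.

```cpp
#include <cmath>

struct P3 { float x, y, z; };   // hypothetical stand-in for Point3

// Hypothetical reduced node transform: translation plus a rotation about z (degrees).
struct NodeXform { float tx, ty, tz, angleZ; };

P3 rotateZ(const P3& p, float deg) {
    float r = deg * 3.14159265358979f / 180.0f;
    float c = std::cos(r), s = std::sin(r);
    return { c * p.x - s * p.y, s * p.x + c * p.y, p.z };
}

P3 translate(const P3& p, float tx, float ty, float tz) {
    return { p.x + tx, p.y + ty, p.z + tz };
}

// Node frame to parent frame: q = T R p (rotate first, then translate).
P3 toParent(const NodeXform& n, const P3& p) {
    return translate(rotateZ(p, n.angleZ), n.tx, n.ty, n.tz);
}

// Parent frame to node frame: p = R^-1 T^-1 q (undo translation, then rotation).
P3 fromParent(const NodeXform& n, const P3& q) {
    return rotateZ(translate(q, -n.tx, -n.ty, -n.tz), -n.angleZ);
}
```

Each factor is inverted simply by negating its parameters, which is why no general matrix inversion is needed.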


Scene rendering 
void render(); 
virtual void draw(); 
Description: 

The render method gets the singleton object of the CameraNode, sets up
the view transformation matrix and calls the method draw. A scene is
rendered by invoking render on the root node. The draw method
is not invoked directly by the application; it is invoked indirectly
on a group node via the method render. The draw method uses
OpenGL functions to push the current node's transformation matrix to
the transformation stack, and recursively calls itself on all child nodes.
This polymorphic method causes objects to be drawn when invoked on
leaf nodes.


Parent node


GroupNode* getParent() const; 


Description: 
This getter method returns the reference to the parent of the current 
node. 


B.2 ObjectNode Class 


The ObjectNode class is a subclass of GroupNode.


Fields 
public:
enum ObjType
{ CUBE, SPHERE, TORUS, TEAPOT, CONE, TETRAHEDRON };
private:
ObjType _object;
float _scaleX, _scaleY, _scaleZ;
float _colorR, _colorG, _colorB;


Description:
The enumerated type ObjType defines a collection of GLUT objects
which users can specify in the constructor to display an object. At the
time of construction, the user can specify its scale factors _scaleX,
_scaleY, _scaleZ, and also its material colour using the normalized
values in the range [0, 1] for _colorR, _colorG, _colorB.


Constructors 
public: 

ObjectNode() : GroupNode(), _object(CUBE),
_scaleX(1.0f), _scaleY(1.0f), _scaleZ(1.0f),
_colorR(1.0f), _colorG(1.0f), _colorB(1.0f) {}

Description: 
The constructor initializes the object type to CUBE, the scale factors 
to 1, and the object material colour to white. 


Setter methods 


void setObject (ObjType object, 
float scaleX, float scaleY, float scaleZ); 
void setColor(float colorR, float colorG, float colorB); 


Description: 
The method setObject is used to change the parameters of the
current object, including its type and scale factors. The setColor
method modifies the material colour of the current object.


B.3 CameraNode Class


The CameraNode class is a subclass of GroupNode. 


Fields 


private:
float _fov, _aspect, _near, _far;
static bool _flag;


Description:
The data members _fov, _aspect, _near, _far define the perspective
view frustum of the camera in terms of the field of view, aspect
ratio, near plane distance and the far plane distance. The Boolean
variable _flag ensures that at most one instance of the class is created.


Constructors 


private:
CameraNode(): GroupNode(), _fov(60.0f), _aspect(1.0f),
_near(1.0f), _far(1000.0f) {}


Description:
The CameraNode class is a singleton class with a private constructor.
The only instance of the class is available through the static
method getInstance(). By default, the camera view frustum has
a 60° field of view, aspect ratio 1, near plane distance 1, and far plane
distance 1,000.
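
The singleton arrangement can be sketched as below. The class name Camera, the accessor fov and the parameter names nearDist and farDist are hypothetical. Note that this sketch swaps in a function-local static object to guarantee a single instance, rather than the static Boolean flag described above, so it illustrates the pattern and not the book's implementation.

```cpp
// Hypothetical, reduced stand-in for CameraNode (illustration only).
class Camera {
public:
    // The only access point; the instance is constructed once, on first use.
    static Camera* getInstance() {
        static Camera instance;
        return &instance;
    }

    // Replaces the default frustum parameters.
    void perspective(float fov, float aspect, float nearDist, float farDist) {
        _fov = fov; _aspect = aspect; _near = nearDist; _far = farDist;
    }

    float fov() const { return _fov; }

    Camera(const Camera&) = delete;             // no copies of the singleton
    Camera& operator=(const Camera&) = delete;

private:
    // Private constructor: defaults of 60 degrees, aspect 1, near 1, far 1000.
    Camera() : _fov(60.0f), _aspect(1.0f), _near(1.0f), _far(1000.0f) {}

    float _fov, _aspect, _near, _far;
};
```

Every call to getInstance returns the same address, so frustum changes made in one part of an application are seen wherever the camera is used.
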
Setter method 


void perspective
(float fov, float aspect, float near, float far);


Description:
This setter method allows you to change the default frustum parameters
of the camera object.


View transformation and projection 


void viewTransform() const; 
void projection() const; 


Description: 


The viewTransform method traverses the scene graph from the 
camera node towards the root node, and pushes the inverse transfor- 
mation matrices of each node onto the transformation stack using 
OpenGL functions. The method calls the inverseTransform 
method of the GroupNode class for this operation. 

The method projection sets up the projection matrix using
OpenGL functions. Neither of the above methods is usually invoked
directly by the user. The render method of the GroupNode class
invokes both methods to set up the view and projective
transformations for the rendering pipeline.


B.4 LightNode Class


The LightNode class is a subclass of GroupNode. 


Fields 
private:
int _glLight;
Description:
This integer field can be assigned a value between 0 and 7. A
value i corresponds to the named light source GL_LIGHTi defined in
OpenGL.
Constructors
public:
LightNode(int glLight): GroupNode(), _glLight(glLight) {}
Description: 
The constructor specifies the index of the OpenGL light source 
to be used for the current object of the LightNode class. The 
default position of the light node is (0, 0, 0). The position can be 
changed by specifying transformation parameters for the node using 
the translate method. Note that all other light source parameters 
will have to be defined separately by the user with the help of OpenGL 
functions. 
Setter Method 
void setLight (int glLight); 
Description: 


This setter method allows the user to change the current light source 
used by the object. 


Appendix C: Vertex Skinning Classes 


This section gives an outline of the methods in the SkinnedMesh, Skeleton and 
SkeletonNode classes used for vertex skinning. A description of these classes 
can be found in Sect. 4.8. A class diagram showing the relationships between the 
classes is given in Fig. C.1.


Fig. C.1 Relationships between the classes used for vertex skinning


C.1 SkeletonNode Class 


The structure of the SkeletonNode class is similar to that of the GroupNode 
class. 


Fields 
private:
list<SkeletonNode*> _children;
int _firstIndex, _lastIndex, _parentIndex;
SkeletonNode* _parent;
float _tx, _ty, _tz, _angleX, _angleY, _angleZ;
Matrix *_matrix, *_invMatrix;
Description:
Every skeleton node is implicitly a group node, and can store references
to a number of children in the list _children. A skeleton node
represents a bone. It also stores a pair of indices _firstIndex and
_lastIndex defining a range of mesh vertices that are attached to
the bone. The translation parameters are stored in variables _tx, _ty,
_tz, and the Euler angles in _angleX, _angleY, _angleZ. The
overall transformation matrix and its inverse are updated whenever
any of the joint angles is changed. Each node is assigned a unique
index starting from 1. The index 0 is reserved for the root node which
represents the origin of the world coordinate system. The parent index
_parentIndex establishes the link between the current node and its
parent node.


Constructors 
public: 

SkeletonNode(int parentIndx,
float tx, float ty, float tz,
int firstIndx, int lastIndx)
: _parent(NULL),
_tx(tx), _ty(ty), _tz(tz),
_angleX(0.0), _angleY(0.0), _angleZ(0.0),
_firstIndex(firstIndx), _lastIndex(lastIndx),
_parentIndex(parentIndx)
{ _matrix = new Matrix();
_invMatrix = new Matrix();
updateMatrices(); }


Description: 
The non-default constructor uses the parameters read in from the input 
file to initialise each node. Note that each node contains two instances 
of the Matrix class. Both the transformation matrix and its inverse are
updated using the input parameters. There is also a default constructor 
that initializes all transformation parameters to 0. 


Add/remove child: 
void addChild(SkeletonNode* node); 
void removeChild(SkeletonNode* node); 


Description: 
These methods are exactly the same as the corresponding methods in 
the GroupNode class (Appendix B). 


Bone transformations: translation 
void translate(float tx, float ty, float tz); 


Description: 
The translation parameters of the bones are set at the time of con- 
struction, and do not change afterwards. Only the translation of the 
base node (with respect to the world coordinate frame) is defined in 
the animation phase. This method is therefore usually invoked by the 
translateBase method of the Skeleton class.


Bone transformations: rotation


void rotateX(float angle); 
void rotateY(float angle); 
void rotateZ(float angle); 
Description: 
These methods are used to set the rotation angle(s) of a bone during the 
animation phase. The methods are normally invoked by the rotate 
method of the Skeleton class. 


Setter methods 
void attachVertices(int firstIndx, int lastIndx); 
void setParentIndex(int parentIndx); 


Description: 
These methods alter the vertex indices and the parent index of the 
current bone. 
Getter methods 


int getParentIndex() const; 
int getFirstIndex() const; 
int getLastIndex() const; 


Matrix* getMatrix() const; 
Matrix* getInverseMatrix() const; 
SkeletonNode* getParent() const; 


int getChildCount() const; 


Description: 
These methods allow you to examine the transformation matrices, 
vertex indices, the parent index and the number of children of the 
current node. 


Transformation update 
void updateMatrices(); 


Description: 


This method updates the transformation matrix and its inverse, and is 
invoked whenever any of the transformation parameters is changed. 


Pre-processing phase 


vector<Point3*> preprocessPhase(vector<Point3*> vertices);
void transform1
(vector<Point3*> vertices, float tx, float ty, float tz);


Description: 
The pre-processing phase builds the product matrix given in Eq. 4.9 
and transforms the mesh vertex list to create a new list of vertices V’. 
The method preprocessPhase returns this new vertex list. The 
method in turn invokes transform1 which traverses the skeleton 
tree from the root, visits every node, combines the inverse translation 
components, and applies the transformation on the node's vertex list. 
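
The traversal performed in the pre-processing phase can be sketched as a recursion that carries the accumulated translation down the tree. This is a simplified illustration, not the implementation behind Eq. 4.9: the struct Bone, the value-type vertex list and the inclusive index range are assumptions, and the combined inverse translation is applied by subtracting the accumulated offsets from each attached vertex.

```cpp
#include <vector>

struct P3 { float x, y, z; };   // hypothetical stand-in for Point3

// Hypothetical simplification of SkeletonNode (illustration only).
struct Bone {
    float tx, ty, tz;             // offset relative to the parent bone
    int firstIndex, lastIndex;    // inclusive range of attached vertices (assumed)
    std::vector<Bone*> children;
};

// Visits every bone from the given node downwards. The offsets (tx, ty, tz)
// passed in are the translations accumulated along the path from the root.
void transform1(const Bone* bone, std::vector<P3>& verts,
                float tx, float ty, float tz) {
    tx += bone->tx;  ty += bone->ty;  tz += bone->tz;
    // Apply the combined inverse translation to this bone's vertices.
    for (int i = bone->firstIndex; i <= bone->lastIndex; ++i) {
        verts[i].x -= tx;
        verts[i].y -= ty;
        verts[i].z -= tz;
    }
    for (const Bone* child : bone->children)
        transform1(child, verts, tx, ty, tz);
}
```

A node with no attached vertices, such as the root, can be given an empty range (for example firstIndex 0 and lastIndex -1) so that the loop body is skipped. After the call, each vertex is expressed relative to the origin of the bone it is attached to.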
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Animation phase 


vector<Point3*> animationPhase(vector<Point3*> vertices);
void transform2(vector<Point3*> vertices, Matrix matrix);


Description: 

In the animation phase, the updated matrices incorporating joint 
angle rotations are gathered in the form of a product matrix given 
in Eq. 4.10. The vertex list obtained from the pre-processing phase 
is transformed using the matrix. The transformed vertex coordinates 
returned by animationPhase are used for rendering the mesh 
for that particular frame. This method invokes transform2 which 
traverses the skeleton tree from the root, post-multiplies the product 
matrix with the matrix at the current node, and transforms the node's 
vertices obtained from the pre-processing phase. 


C.2 Skeleton Class 


Fields 
private:
SkeletonNode* _root;
SkeletonNode* _base;
vector<SkeletonNode*> _bones;
Description:
Each skeleton tree is referenced by its root node, stored in the
variable _root. This node is created by the constructor. The base node
(_base) is a special node in the skeleton tree that has the root node
as its parent. The transformations of the base node define the position
and the orientation of the entire mesh in the world coordinate frame.
The class also maintains a list of references to the skeleton nodes as
they are created by the loadSkeleton method.
Constructors 
public: 
Skeleton() 
: _root( new SkeletonNode() ) {}
Description: 


The constructor creates the root node of the skeleton and initializes it 
with the default transformation parameters. 


Getter method 
SkeletonNode* getRoot() const; 


Description: 
The getter method returns the reference to the root node.


Loading skeleton data


void loadSkeleton(const string& filename); 


Description: 
This method loads skeleton data from a file formatted as shown in 
Fig. 4.18, and creates an instance of the SkeletonNode class for 
each bone. The method also calls attachBones that creates the 
hierarchical relationships between nodes (bones). 


Bone transformations 
void rotate(int i, 
float angleX, float angleY, float angleZ); 
void translateBase(float tx, float ty, float tz); 


Description: 

The translation parameters specifying the spatial offsets of each 
bone relative to its parent are assigned to the nodes through the 
constructor. These parameters are used for transforming vertices in 
the pre-processing phase. In the animation phase, only the joint 
angle rotations and the translation of the base node can change. 
The rotate method specifies the joint angles of the ith bone. The 
translateBase method changes the translation parameters of the 
base node. These two methods are usually called within the display 
loop of the application. 


C.3 SkinnedMesh Class 


Fields 


private:
vector<Point3*> _verticesV;
vector<Point3*> _verticesVT;
vector<Point3*> _verticesW;
vector<Polygon*> _polygons;
PolyType _polytype;
float _colorR, _colorG, _colorB;
Skeleton* _skeleton;

public:

enum PolyType {TRIANGLE, QUAD};


Description: The vertex lists _verticesV, _verticesVT, _verticesW
represent the lists V, V', W shown in Fig. 4.11. The lists contain the
mesh coordinates in the bind pose, after the pre-processing phase, and
after the animation phase respectively. The polygon list _polygons
stores the vertex indices of the mesh polygons. For the sake of
simplicity, each mesh is assigned a single material colour given by
_colorR, _colorG, _colorB. A mesh can be either a triangular
or a quad mesh. The variable _skeleton stores the reference to the
skeleton associated with the mesh.


Constructors
public:
SkinnedMesh(PolyType polytype)
: _polytype(polytype), _skeleton(NULL) {}

Description: 
The constructor specifies only the polygon type of the mesh using
the enumerated values TRIANGLE and QUAD. Mesh data is loaded
using the loadMesh method. The application must also load skeleton
data using an instance of the Skeleton class, and attach the skeleton
object using the attachSkeleton method.


Loading a mesh 


void loadMesh(const string& filename); 


Description: 
This method loads mesh data from a file formatted as shown in Fig.
4.19. The number of vertices per polygon in the file should match the
polygon type provided to the constructor. The method also populates
the vertex lists _verticesV and _verticesW with the initial vertex
coordinates obtained from the file. The polygon list _polygons is
also populated with polygon data.


Getter method 
Skeleton* getSkeleton() const; 


Description: 


The getter method returns the reference to the skeleton object attached 
to the current SkinnedMesh object. 


Setting mesh colour 


void setColor(float colorR, float colorG, float colorB); 


Description: 
The method sets a material colour for the entire mesh.


Attaching a skeleton 


void attachSkeleton(Skeleton* skeleton); 


Description: 
This method associates a skeleton object with the current mesh. The 
pre-processing of mesh vertices V to obtain an intermediate set of 
vertices V' (Eq. 4.9) is also initiated at this stage.


Rendering a mesh


void render(); 


Description: 


This method is usually called inside the display loop of the application 
for redrawing the mesh with the updated joint angle configuration. 
Typically, this method is called after specifying the bone transforma- 
tions using the rotate method of the Skeleton class. 


Appendix D: Quaternion Classes 


This section gives an outline of methods in the classes that represent quaternion 
and dual quaternion numbers. Figure D.1 shows the static relationships between the 
classes and the geometry classes. 


Fig. D.1 Relationships between the quaternion classes and the geometry
classes


D.1 Quaternion Class


Fields


private:
Matrix* _mat;
static float RADTODEG;
static float DEGTORAD;
static float EPS;
public:
float _q0, _q1, _q2, _q3;


Description:
Every quaternion object has an associated 4 × 4 transformation matrix
_mat. The matrix elements are not automatically updated. The
user needs to call updateMatrix to compute the values of the
matrix elements. The constants RADTODEG and DEGTORAD store
the conversion factors from radians to degrees and degrees to radians
respectively. The quaternion components _q0, _q1, _q2, _q3 are
declared as public as they are frequently accessed. EPS stores the
constant value 1.E-6 used as a threshold for checking if a value is
close to zero.


Constructors 
public: 
Quat(float q0, float q1, float q2, float q3);
Quat(const Point3* p);
Quat(float angle, const Vec3* axis);
Quat();
Description: 


The first constructor initializes an object with four quaternion com- 
ponents. The second constructor takes a point P = (x, y, z) as the 
argument, and forms the pure quaternion (0, x, y, z). The third 
constructor forms a unit quaternion using the angle and axis of a three- 
dimensional rotation as parameters. The quaternion is constructed 
as per Eq. 5.44. The fourth no-argument constructor initializes the 
quaternion components to (1, 0, 0, 0). 


Getter methods 
Matrix* getMatrix() const; 
Point3* getPoint() const; 
float getAngle() const; 
Vec3* getAxis() const; 
Vec3* getEuler() const; 
Description: 


The first getter method given above returns the current matrix _mat.
The second getter method returns the last three components _q1,
_q2, _q3 of the current quaternion as a point. The third and fourth
getter methods return respectively the angle and axis of the equivalent
rotation given by Eqs. 5.45 and 5.46. The method getEuler extracts
the Euler angles from the quaternion components using Eq. 5.56.


Quaternion operations 


Quat* add(const Quat* q) const; 
Quat* subtract(const Quat* q) const; 
Quat* mult(const Quat* q) const; 
Quat* scalarMult(float term) const; 
Quat* conjugate() const; 

Quat* negate() const; 


Description: 


The methods listed above perform algebraic operations of addition, 
subtraction, multiplication, scalar multiplication, conjugation and 
negation, and return the resulting quaternion.


Quaternion norm


float norm() const; 
Description: 
The above method returns the magnitude of the current quaternion 
(Eq. 5.17). 


Quaternion matrix 


void updateMatrix(); 
Description: 
Each quaternion object has an associated transformation matrix as 
given in Eq. 5.23. The above method must be called whenever a 
quaternion component has changed, in order to update this matrix. 


Quaternion transformation 


Point3* transform(const Point3* point); 


Description: 
The above method transforms a point using the current quaternion 
according to the formula P’ = QPQ*. 
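
The formula P' = QPQ* can be sketched with a plain quaternion product. The structs Q and P3 and the free functions are hypothetical value-type stand-ins for the Quat and Point3 classes. The helper fromAngleAxis uses the standard unit quaternion (cos(θ/2), sin(θ/2) n) for a rotation by θ about a unit axis n; taking the angle in degrees is an assumption. The result is a pure rotation only when Q is a unit quaternion.

```cpp
#include <cmath>

struct Q  { float q0, q1, q2, q3; };   // hypothetical stand-in for Quat
struct P3 { float x, y, z; };          // hypothetical stand-in for Point3

// Quaternion (Hamilton) product a b.
Q mult(const Q& a, const Q& b) {
    return { a.q0 * b.q0 - a.q1 * b.q1 - a.q2 * b.q2 - a.q3 * b.q3,
             a.q0 * b.q1 + a.q1 * b.q0 + a.q2 * b.q3 - a.q3 * b.q2,
             a.q0 * b.q2 - a.q1 * b.q3 + a.q2 * b.q0 + a.q3 * b.q1,
             a.q0 * b.q3 + a.q1 * b.q2 - a.q2 * b.q1 + a.q3 * b.q0 };
}

Q conjugate(const Q& a) { return { a.q0, -a.q1, -a.q2, -a.q3 }; }

// Unit quaternion for a rotation of angleDeg degrees about a unit-length axis.
Q fromAngleAxis(float angleDeg, const P3& axis) {
    float h = 0.5f * angleDeg * 3.14159265358979f / 180.0f;
    float s = std::sin(h);
    return { std::cos(h), s * axis.x, s * axis.y, s * axis.z };
}

// P' = Q P Q*, with P embedded as the pure quaternion (0, x, y, z).
P3 transform(const Q& q, const P3& p) {
    Q pure = { 0.0f, p.x, p.y, p.z };
    Q r = mult(mult(q, pure), conjugate(q));
    return { r.q1, r.q2, r.q3 };
}
```

For example, a rotation of 90° about the z-axis built with fromAngleAxis maps the point (1, 0, 0) to (0, 1, 0) up to rounding error.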


Conversion to unit quaternion 


void normalize(); 
Description: 
The method normalize converts the current quaternion to a unit 
quaternion. 


Quaternion interpolation 
Quat* lerp(float t, Quat* q); 
Quat* slerp(float t, Quat* q); 


Description: 
The above methods perform linear (lerp) and spherical linear 
(slerp) interpolations between the current quaternion and the sup- 
plied quaternion q, and return an intermediate quaternion for the 
parameter value given by t. 
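
The two interpolation schemes can be sketched as follows, assuming unit quaternions as input. The struct Q and the free-function form are hypothetical. Two details are assumptions of this sketch rather than documented behaviour of the Quat class: the second quaternion is negated when the dot product is negative, so that the shorter arc is followed, and slerp falls back to lerp when the two quaternions are nearly identical, to avoid dividing by a vanishing sine.

```cpp
#include <cmath>

struct Q { float q0, q1, q2, q3; };   // hypothetical stand-in for Quat

float dot4(const Q& a, const Q& b) {
    return a.q0 * b.q0 + a.q1 * b.q1 + a.q2 * b.q2 + a.q3 * b.q3;
}

// Linear interpolation (1 - t) a + t b; the result is generally not unit length.
Q lerp(const Q& a, const Q& b, float t) {
    float s = 1.0f - t;
    return { s * a.q0 + t * b.q0, s * a.q1 + t * b.q1,
             s * a.q2 + t * b.q2, s * a.q3 + t * b.q3 };
}

// Spherical linear interpolation between unit quaternions a and b.
Q slerp(const Q& a, Q b, float t) {
    float c = dot4(a, b);
    if (c < 0.0f) {                      // assumed: take the shorter arc
        b = Q{ -b.q0, -b.q1, -b.q2, -b.q3 };
        c = -c;
    }
    if (c > 1.0f - 1e-6f) return lerp(a, b, t);   // nearly identical inputs
    float omega = std::acos(c);          // angle between a and b
    float s = std::sin(omega);
    float wa = std::sin((1.0f - t) * omega) / s;
    float wb = std::sin(t * omega) / s;
    return { wa * a.q0 + wb * b.q0, wa * a.q1 + wb * b.q1,
             wa * a.q2 + wb * b.q2, wa * a.q3 + wb * b.q3 };
}
```

Unlike lerp, slerp keeps the intermediate quaternion on the unit sphere, so the halfway point between the identity and a 90° rotation is exactly the 45° rotation about the same axis.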
Output 


void print(); 
Description: 


The above method prints the component values of the current quaternion.


D.2 Dual Quaternion Class


Fields 


private: 
Quaternion *_quat1, *_quat2;


Description:
Each dual quaternion is composed using two quaternions _quat1,
_quat2 as described in Sect. 5.9.2.


Constructors 


DualQuat(const Quat* quat1, const Quat* quat2);
DualQuat(float angle, const Vec3* axis, const Vec3* trans);
DualQuat(const Point3* p);


Description:
The first constructor shown above forms a dual quaternion using
two quaternion components. The second constructor uses the rigid-body
transformation parameters (angle and axis of rotation, and
translation vector) to construct the equivalent dual quaternion. The
third constructor creates the dual quaternion (1, 0, 0, 0, 0, x, y, z) using
the coordinates (x, y, z) of the specified point.


Getter methods 

Quat* getQuat1() const;

Quat* getQuat2() const; 

Point3* getPoint() const; 

Description: 

The first two methods shown above return respectively the first and 
the second quaternion components of the current dual quaternion. The 
third method returns the last three elements (of the second quaternion 
component) as the coordinates of a point. 


Product of two dual quaternions 


DualQuat* mult(const DualQuat* q) const; 
Description: 
The above method returns the product of the current dual quaternion 
and the specified dual quaternion (q). The product is computed using 
the formula in Eq. 5.85. 


Product of a dual quaternion and a quaternion 
DualQuat* multQuat (const Quat* q) const; 
Description: 
The above method returns the product of the current dual quaternion 
and the specified quaternion (q). The product is computed using the 
formula in Eq. 5.86.


Dual quaternion transformation


Point3* transform(const Point3* point); 


Description: 
The above method transforms a point using the current dual quaternion
according to the formula in Eq. 5.97. 


Output 


void print(); 


Description: 
The above method prints the component values of the current dual 
quaternion. 


Index 


A 
AABB. See Axis aligned bounding box 
(AABB) 
Adjacency queries, 186 
Affine transform, 19 
Agglomerative clustering, 259 
Algorithm 
circular alignment, 132 
closest point, 270 
cyclic coordinate descent, 130 
de-Casteljau, 154 
Graham Scan, 239 
incremental hull, 241 
Newton-Raphson, 133 
rotating calipers, 238 
three-coins, 219 
Welzl’s, 233 
Angle-axis transformation, 86 
equivalent angle, 88 
equivalent axis, 88 
interpolation, 97 
matrix equation, 87 
non-uniform motion, 97 
vector equation, 87 
Angle between vectors, 7 
Angle-optimal, triangulation, 218 
Angular velocity, 119 
Approximating curve, 163 
Articulated character mode, 35 
Average plane, 195 
Axis aligned bounding box (AABB), 232 


B 

Ball and socket joint, 114 
Barycentre, 22 

Barycentric coordinates, 22 


Barycentric embedding, 210 
Basis functions, 160 
Bernstein polynomials, 20 
Bezier basis, conversion to, 153 
Bezier curve, 21, 55, 151, 154, 165, 
170 
as a B-spline curve, 165 
rational, 156 
Bezier polynomials 
cubic, 55, 151 
geometrical interpretation, 154 
quadratic, 154 
Bilinear interpolation, 21 
Binary space partitioning (BSP) tree, 
267 
Bind pose, 60 
Blending functions, 148, 169 
Blending polynomials, 144 
Blinn's approximation, 26 
Bounding interval hierarchy, 271 
Bounding volume hierarchies, 41, 257 
using AABB, 258 
bottom-up design, 259 
cost function, 262 
top-down construction, 258 
traversal, 261 
Bounding volume intersection 
AABB-AABB, 245 
kDOP-kDOP, 253 
plane-kDOP, 253 
plane-OBB, 247 
plane-sphere, 252 
ray-AABB, 243 
ray-kDOP, 253 
ray-OBB, 247 
ray-sphere, 251 
sphere-sphere, 252
Bounding volumes
AABB, 232, 260 
convex hull, 241 
k-DOP, 239, 260 
merging, 260 
minimal, 41 
multi-scale representation, 257 
OBB, 237 
sphere, 233, 260 
B-splines, 159, 163 


BSP tree. See Binary space partitioning (BSP) tree


C 
Candy-wrapper effect, 66 
Cardinal splines, 149 
Catmull-Clark subdivision, 205 
Catmull-Rom spline, 150 
CCD. See Cyclic coordinate descent (CCD) 
Circle through three points, 24 
Circular alignment algorithm, 132 
Collapsing elbow effect, 66 
Collinearity of points, 12 
Collision detection 
broad-phase, 263 
narrow phase, 263 
Collision testing, 260 
Compatible faces, 186 
Complex numbers 
addition, 77 
conjugate, 78 
multiplication, 77 
multiplicative inverse, 78 
orthogonal basis for, 77 
representation of, 77 
as rotation operators, 78 
subtraction, 77 
tuple notation, 77 
Conjugate transformation, 19 
Continuity constraints, 146 
Convex combination of points, 20, 173 
Convex hull, 241 
Convex polygon, 196, 241 
Coons patch, 170 
Coplanarity of four points, 12 
Coplanar vectors, 13 
Cost function, edge collapse operation, 199 
Covariance matrix, 237 
Cox de Boor formula, 160 
Cubic polynomials, 141 
Curvature, 16 
Curve 
approximating, 139 
bi-normal vector, 17 




interpolating, 139 
normal direction at a point, 16 
normal plane, 17 
orthonormal basis at a point, 16 
osculating plane, 17 
tangent vector, 16 
tension, 157, 158 
torsion, 17 

Cyclic coordinate descent (CCD), 130 
drawbacks, 131 


D 
DCEL. See Doubly Connected Edge List 
(DCEL) 
De-Casteljau’s method, 154, 157 
Delaunay triangulation, 218 
Diffuse reflection, 25 
Direction cosines, 9 
Discrete harmonic metric, 212 
Discrete oriented polytope, 239 
Doubly Connected Edge List (DCEL), 191 
Dual numbers, 104 
algebra of, 104 
conjugate, 105 
multiplication rule, 105 
multiplicative inverse, 105 
square-root, 105 
Dual quaternion, 104, 105 
basis, 106 
conjugates, 107 
multiplication table, 106 
product, 105 
rigid-body transformation, 108 
transformations using, 108 
unit, 108 


E 
Edge-based data structure 
half-edge, 190 
winged-edge, 188 
Edge collapse operation, 196 
Edge flip operation, 208, 218 
Edge, silhouette, 242 
Edge-visible polygon, 217 
End effector, 113 
linear velocity, 120 
Error metric 
for edge collapse, 199 
quadric, 199 
for vertex decimation, 196 
Euler angles 
angular velocity vector using, 120 




interpolation, 96 
proper, 84 
from quaternions, 92 
sequence, 84 
transformation matrix, 84 
Euler characteristic, 186 
Euler-Poincare formula, 185 
Euler's formula, 101, 186 
Euler's theorem of rotations, 84 
Exponential function for quaternions, 
102 
Extraordinary vertices, 203 
Extrinsic composition of rotations, 85 


F 

Face-based data structure, 186 
First-person view, 47 

Forward kinematics, 115 
Frenet frame, 17 


G 

Gauss-Seidel iteration, 213, 214 
Geometric continuity, 146 
Gradient descent, 128 

Graham Scan algorithm, 242 


H 

Half-edge data structure, 190 
Half-way vector, 25 

Hermite interpolation, 57 

Hermite polynomials, 148, 169, 171 
Hermite splines, 147 
Homogeneous coordinates, 5 
Horner's method, 141 


I 
Incremental hull algorithm, 241 
Interpolating curve, 139 
Interpolating patch, 168 
Interpolation
basis matrix for, 144
Euler angle, 96
Hermite, 57
linear, 20
quaternion, 98
trigonometric, 20
Intrinsic composition of rotations, 85
Inverse kinematics, 124
using circular alignment algorithm, 132
using cyclic coordinate descent, 130
using gradient descent, 128
using Jacobian inverse, 127
2-link, 125
n-link, 126


J 
Jacobian matrix, 124, 127 
inverse, 127 
Jacobi method, 213 
Joint 
Hooke's, 114 
prismatic, 114 
revolute, 114 
spherical, 114 
Joint chains, 57 
planar, 115 
scene graph representation, 118 
transformations, 116, 117 


K 
k-DOP, 239 
K-d tree, 267 
closest point algorithm, 270 
for ray tracing, 269
sequential traversal, 268 
three-dimensional, 267 
Keyframe animation, 66 
Kinematics, 113 
forward, 115 
inverse, 124 
Knot points, 164 
Knots, 145 
Knot vector, 160 
clamped, 165 
multiplicity, 166 


L 
Lagrange polynomials, 140 
Lambertian reflectance, 25 
LCA. See Lowest Common Ancestor (LCA) 
Left pseudo-inverse, 127 
Line 
equation in standard form, 11 
shortest distance to, 12 
Linear transformations, 19 
Linear velocity, 119 
Logarithm of unit quaternion, 101 
Loop subdivision, 203 
Lowest Common Ancestor (LCA), 39 


M
Matrix 
angle axis transformation, 87 
covariance, 237 
Euler angle transformation, 84 
Jacobian, 124 
model transformation, 38 
model-view, 38 
quaternion, 79 
quaternion transformation, 82 
Vandermonde, 141 
Mean value metric, 212 
Mesh 
closed manifold, 214 
data structures, 186 
manifold, 184 
non-manifold, 184 
parameterization, 209 
regular, 186 
representation, 179 
simplification, 194 
subdivision, 201, 206 
Mesh file format, 180 
OBJ, 180 
OFF, 182 
PLY, 182, 183 
Mesh vertex transformation, 60 
Minimal bounding sphere, 233 
Minimum energy configuration, 210 
Möbius strip, 186
Model transformation matrix, 38 
Model-view matrix, 38 
Monotone polygonal chain, 217 


N
Nearest neighbor algorithm, 270
Newton-Raphson method, 133
Node
base, 59
root, 59
Non-manifold mesh, 184
Non-uniform rational basis spline, 166
Normal plane, 17


O
OBB. See Oriented bounding box (OBB) 
Octree, 263 
index representation, 264 
top-down traversal, 265 
One-ring neighbourhood, 185, 211 
traversal, 187 
Orientable mesh, 185 
Orientation of 3 points, 11
Oriented bounding box (OBB), 237
projected distances of radii, 250
representation using three slabs, 247
Osculating plane, 17


P 
Parametric continuity, 145 
Perp-vector, 7 
Phong-Blinn illumination model, 26 
Pitch rotation, 85 
Planar embedding, 210 
Plane 
equation, 243 
equation using three points, 12 
intersection, 14 
normal vector, 13 
parametric equation, 14 
point-normal form, 13 
point of intersection with ray, 13 
shortest distance of a point, 13 
vector equation, 13 
Point inclusion test, 242 
Points 
addition, 6 
affine combination, 20 
collinearity, 12 
convex combination, 20 
coplanarity, 12 
linear interpolation of, 20 
subtraction, 6 
trigonometric interpolation, 20 
Point set triangulation, 215 
Polygon 
convex, 216 
edge-visible, 217 
kernel of, 216 
monotone, 217, 222 
regular, 216 
simple, 216 
star-shaped, 216, 219 
triangulation, 215 
types, 216 
weakly externally visible, 217 
Polygonal manifold, 184 
Polynomial interpolation, 156 
Polynomial interpolation theorem, 139 
Polynomials 
Bernstein, 20 
blending, 144 
cubic, 141 
evaluation using Horner’s method, 141
Lagrange, 140
Polytope, 239
Pose, 93 
Prismatic joint, 114 


Q 
QEM. See Quadric error metric (QEM)
Quadric error metric (QEM), 199
Quadtree, 266 
Quaternion 
using Euler angles, 91 
exponentiation, 101 
inverse, 81 
linear interpolation, 98 
logarithm, 101 
magnitude, 80 
negative, 93 
norm, 80 
orthogonal basis, 79 
product, 79 
pure, 81 
real, 81 
relative, 103 
representation of 3D rotation, 89 
scalar part, 79 
vector part, 79 
velocity, 122 
Quaternion transformation, 81 
fixed point of, 82 
inverse, 82 
matrix, 82, 90 


R 
Rational Bezier curve, 156 
Ray
equation, 243, 269
parametric equation, 11
Ray tracing, using k-d tree, 269 
Real quaternion, 81 
Rectifying plane, 17 
Redundant manipulator, 126 
Reflection vector, 9 
Regular polygon, 216 
Relative transformation, 38 
Revolute joint, 114 
Robot manipulator arm, 113 
Rodrigues rotation formula, 87 
Roll rotation, 85 
Root-3 subdivision, 207 
Rotating calipers method, 238 
Rotation
angle-axis, 86
general three-dimensional, 84
pitch, 85
quaternion, 89
roll, 85
yaw, 85


S 
Scatter matrix, 237 
Scene graph 
camera node, 46 
light node, 47 
nodes, 32 
object node, 32, 45 
standard form, 38 
world node, 31 
Separating axis theorem, 248, 256 
Separating plane, 248 
Sequential binary tree search, 268 
Signed angle between vectors, 9 
Signed area, 10 
Signed distance, 13 
Silhouette edges, 242 
Simple polygon, 216 
Singular value decomposition (SVD), 127
Skeleton 
bone, 57 
skin, 58 
Smoothness constraints, 146 
Spatial partitioning trees, 263 
Specular reflection, 25 
Sphere 
antipodes of, 234 
minimal, 234 
Spherical coordinates, 214 
Spherical embedding, 214 
Spherical joint, 114 
Splines, 145 
basis, 159 
Bezier, 151 
cardinal, 149 
Catmull-Rom, 150 
cubic Bezier, 152 
Hermite, 147 
interpolating, 156 
segment, 144 
support of, 162 
Spring constants, 212 
Spring displacement, 210 
Standard triangle format, 182 
Star-shaped polygon, 196, 216 
Subdivision curve, 201 
Subdivision masks, 203 
Surface design, 167 
Surface normal vector, 13
Surface of revolution, 167 
Surface patches, 167 
bi-cubic, 169 
bi-cubic Bezier, 172 
bi-cubic Coons, 171
SVD. See Singular value decomposition (SVD)


T 
Taylor's approximation, 123 
Three-Coins algorithm, 219, 242 
Torsion of a curve, 17 
Transformation 
angle-axis rotation, 86 
conjugate, 19 
dual-quaternion, 109 
Euler angle, 84 
hierarchy, 33 
quaternion, 81 
rigid-body, 83 
translation, 19 
Transformation blending, 65 
Triangle 
area, 7 
intersection with another triangle, 254
intersection with ray, 253 
signed area, 10 
strip, 180 
Triangular subdivision, 204 
Triangulation 
angle-optimal, 218 
Delaunay, 218 
Trilinear coordinates, 22 
Twist vector, 168 


U
Uniform B-splines, 161
Unit complex numbers, 78 
Upper triangular matrix, 142


V 
Valence, 185 
Vandermonde matrix, 141, 142 
Vector 
addition, 6 
cross-product, 7 
dot-product, 7 
magnitude, 7 
normal, 8 
projections, 9 
reflection, 9 
resolving components, 9 
scalar triple product, 8 
unit, 7 
vector triple product, 8 
Velocity 
angular, 119 
Euler angle rates, 120 
Vertex 
decimation algorithm, 194 
blending, 55 
boundary, 195 
extraordinary, 203 
list, 179 
one-ring neighbourhood of, 185 
split operation, 197 
valence, 185 
View transformation matrix, 45 
Visualizing 3D rotations, 95 


W
Wachspress metric, 212
Walk sequence, 67
Weakly externally visible (WEV) polygon, 217
Welzl’s algorithm, 233
Winged-edge data structure, 188


X 
X-monotone polygons, 222 


Y 
Yaw rotation, 85 


